NOVEL NUCLEIC ACIDS AND POLYPEPTIDES

1. TECHNICAL FIELD 

The present invention provides novel polynucleotides and proteins encoded by such 
polynucleotides, along with uses for these polynucleotides and proteins, for example in 
therapeutic, diagnostic and research methods. 

2. BACKGROUND 

Technology aimed at the discovery of protein factors (including, e.g., cytokines, such as
lymphokines, interferons, CSFs, chemokines, and interleukins) has matured rapidly over the past 
decade. The now routine hybridization cloning and expression cloning techniques clone novel 
polynucleotides "directly" in the sense that they rely on information directly related to the 
discovered protein (i.e., partial DNA/amino acid sequence of the protein in the case of
hybridization cloning; activity of the protein in the case of expression cloning). More recent 
"indirect" cloning techniques such as signal sequence cloning, which isolates DNA sequences 
based on the presence of a now well-recognized secretory leader sequence motif, as well as 
various PCR-based or low stringency hybridization-based cloning techniques, have advanced the 
state of the art by making available large numbers of DNA/amino acid sequences for proteins 
that are known to have biological activity, for example, by virtue of their secreted nature in the 
case of leader sequence cloning, by virtue of their cell or tissue source in the case of PCR-based 
techniques, or by virtue of structural similarity to other genes of known biological activity.

Identified polynucleotide and polypeptide sequences have numerous applications in, for
example, diagnostics, forensics, gene mapping, identification of mutations responsible for
genetic disorders or other traits, assessment of biodiversity, and production of many other types
of data and products dependent on DNA and amino acid sequences.

3. SUMMARY OF THE INVENTION 

The compositions of the present invention include novel isolated polypeptides, novel 
isolated polynucleotides encoding such polypeptides, including recombinant DNA molecules, 
cloned genes or degenerate variants thereof, especially naturally occurring variants such as allelic 
variants, antisense polynucleotide molecules, and antibodies that specifically recognize one or more 
epitopes present on such polypeptides, as well as hybridomas producing such antibodies. 

The compositions of the present invention additionally include vectors, including expression 
vectors, containing the polynucleotides of the invention, cells genetically engineered to contain such 
polynucleotides and cells genetically engineered to express such polynucleotides. 

The present invention relates to a collection or library of at least one novel nucleic acid
sequence assembled from expressed sequence tags (ESTs) isolated mainly by sequencing by
hybridization (SBH), and in some cases, sequences obtained from one or more public databases.
The invention relates also to the proteins encoded by such polynucleotides, along with therapeutic,
diagnostic and research utilities for these polynucleotides and proteins. These nucleic acid
sequences are designated as SEQ ID NO: 1-1350. The polypeptide sequences are designated SEQ
ID NO: 1351-2700. The nucleic acids and polypeptides are provided in the Sequence Listing. In
the nucleic acids provided in the Sequence Listing, A is adenine; C is cytosine; G is guanine; T is
thymine; and N is any of the four bases. In the amino acids provided in the Sequence Listing, *
corresponds to the stop codon.

The nucleic acid sequences of the present invention also include nucleic acid sequences that
hybridize to the complement of SEQ ID NO: 1-1350 under stringent hybridization conditions;
nucleic acid sequences which are allelic variants or species homologues of any of the nucleic acid
sequences recited above; or nucleic acid sequences that encode a peptide comprising a specific
domain or truncation of the peptides encoded by SEQ ID NO: 1-1350. Also included is a
polynucleotide comprising a nucleotide sequence having at least 90% identity to an identifying
sequence of SEQ ID NO: 1-1350 or a degenerate variant or fragment thereof. The identifying
sequence can be 100 base pairs in length.
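
By way of illustration only, the 90% identity threshold over a 100 base pair identifying sequence can be sketched as a simple position-by-position comparison. The sketch assumes the two sequences have already been aligned and have equal length; the function name `percent_identity` is hypothetical, and the comparison method is an assumption, since this passage does not specify how identity is computed.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent of identical positions between two pre-aligned, equal-length sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    same = sum(x == y for x, y in zip(a.upper(), b.upper()))
    return 100.0 * same / len(a)

# Hypothetical 100-base identifying sequence and a variant with 10 substitutions.
reference = "ACGT" * 25
variant = list(reference)
for i in range(0, 40, 4):      # positions 0, 4, ..., 36 all hold 'A'
    variant[i] = "C"
variant = "".join(variant)

print(percent_identity(reference, variant) >= 90.0)
```

Under these assumptions, a 100-base candidate meets the threshold when at most 10 positions differ from the identifying sequence.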

The nucleic acid sequences of the present invention also include the sequence information
from the nucleic acid sequences of SEQ ID NO: 1-1350. The sequence information can be a
segment of any one of SEQ ID NO: 1-1350 that uniquely identifies or represents the sequence
information of SEQ ID NO: 1-1350.

A collection as used in this application can be a collection of only one polynucleotide. The
collection of sequence information or identifying information of each sequence can be provided on
a nucleic acid array. In one embodiment, segments of sequence information are provided on a
nucleic acid array to detect the polynucleotide that contains the segment. The array can be designed
to detect full-match or mismatch to the polynucleotide that contains the segment. The collection
can also be provided in a computer-readable format.

This invention also includes the reverse or direct complement of any of the nucleic acid
sequences recited above; cloning or expression vectors containing the nucleic acid sequences; and
host cells or organisms transformed with these expression vectors. Nucleic acid sequences (or their
reverse or direct complements) according to the invention have numerous applications in a variety
of techniques known to those skilled in the art of molecular biology, such as use as hybridization
probes, use as primers for PCR, use in an array, use in computer-readable media, use in sequencing
full-length genes, use for chromosome and gene mapping, use in the recombinant production of
protein, and use in the generation of anti-sense DNA or RNA, their chemical analogs and the like.

In a preferred embodiment, the nucleic acid sequences of SEQ ID NO: 1-1350 or novel
segments or parts of the nucleic acids of the invention are used as primers in expression assays that
are well known in the art. In a particularly preferred embodiment, the nucleic acid sequences of
SEQ ID NO: 1-1350 or novel segments or parts of the nucleic acids provided herein are used in
diagnostics for identifying expressed genes or, as well known in the art and exemplified by Vollrath
et al., Science 258:52-59 (1992), as expressed sequence tags for physical mapping of the human
genome.

The isolated polynucleotides of the invention include, but are not limited to, a
polynucleotide comprising any one of the nucleotide sequences set forth in SEQ ID NO: 1-1350; a
polynucleotide comprising any of the full length protein coding sequences of SEQ ID NO: 1-1350;
and a polynucleotide comprising any of the nucleotide sequences of the mature protein coding
sequences of SEQ ID NO: 1-1350. The polynucleotides of the present invention also include, but
are not limited to, a polynucleotide that hybridizes under stringent hybridization conditions to (a)
the complement of any one of the nucleotide sequences set forth in SEQ ID NO: 1-1350; (b) a
nucleotide sequence encoding any one of the amino acid sequences set forth in the Sequence Listing
(e.g., SEQ ID NO: 1351-2700); (c) a polynucleotide which is an allelic variant of any
polynucleotides recited above; (d) a polynucleotide which encodes a species homolog (e.g.,
orthologs) of any of the proteins recited above; or (e) a polynucleotide that encodes a polypeptide
comprising a specific domain or truncation of any of the polypeptides comprising an amino acid
sequence set forth in the Sequence Listing.

The isolated polypeptides of the invention include, but are not limited to, a polypeptide
comprising any of the amino acid sequences set forth in the Sequence Listing; or the corresponding
full length or mature protein. Polypeptides of the invention also include polypeptides with
biological activity that are encoded by (a) any of the polynucleotides having a nucleotide sequence
set forth in SEQ ID NO: 1-1350; or (b) polynucleotides that hybridize to the complement of the
polynucleotides of (a) under stringent hybridization conditions. Biologically or immunologically
active variants of any of the polypeptide sequences in the Sequence Listing, and "substantial
equivalents" thereof (e.g., with at least about 65%, 70%, 75%, 80%, 85%, 90%, 95%, 98% or 99%
amino acid sequence identity) that preferably retain biological activity are also contemplated. The
polypeptides of the invention may be wholly or partially chemically synthesized but are preferably
produced by recombinant means using the genetically engineered cells (e.g., host cells) of the
invention.

The invention also provides compositions comprising a polypeptide of the invention.
Polypeptide compositions of the invention may further comprise an acceptable carrier, such as a
hydrophilic, e.g., pharmaceutically acceptable, carrier.

The invention also provides host cells transformed or transfected with a polynucleotide of
the invention.

The invention also relates to methods for producing a polypeptide of the invention
comprising growing a culture of the host cells of the invention in a suitable culture medium
under conditions permitting expression of the desired polypeptide, and purifying the polypeptide
from the culture or from the host cells. Preferred embodiments include those in which the
protein produced by such process is a mature form of the protein.

Polynucleotides according to the invention have numerous applications in a variety of
techniques known to those skilled in the art of molecular biology. These techniques include use
as hybridization probes, use as oligomers, or primers, for PCR, use for chromosome and gene
mapping, use in the recombinant production of protein, and use in generation of anti-sense DNA
or RNA, their chemical analogs and the like. For example, when the expression of an mRNA is
largely restricted to a particular cell or tissue type, polynucleotides of the invention can be used
as hybridization probes to detect the presence of the particular cell or tissue mRNA in a sample
using, e.g., in situ hybridization.

In other exemplary embodiments, the polynucleotides are used in diagnostics as
expressed sequence tags for identifying expressed genes or, as well known in the art and
exemplified by Vollrath et al., Science 258:52-59 (1992), as expressed sequence tags for
physical mapping of the human genome.

The polypeptides according to the invention can be used in a variety of conventional
procedures and methods that are currently applied to other proteins. For example, a polypeptide
of the invention can be used to generate an antibody that specifically binds the polypeptide. Such
antibodies, particularly monoclonal antibodies, are useful for detecting or quantitating the
polypeptide in tissue. The polypeptides of the invention can also be used as molecular weight
markers, and as a food supplement.

Methods are also provided for preventing, treating, or ameliorating a medical condition
which comprises the step of administering to a mammalian subject a therapeutically effective
amount of a composition comprising a polypeptide of the present invention and a
pharmaceutically acceptable carrier.

In particular, the polypeptides and polynucleotides of the invention can be utilized, for
example, in methods for the prevention and/or treatment of disorders involving aberrant protein
expression or biological activity.



WO 01/57188 



PCT/US01/03800 



The present invention further relates to methods for detecting the presence of the
polynucleotides or polypeptides of the invention in a sample. Such methods can, for example, be
utilized as part of prognostic and diagnostic evaluation of disorders as recited herein and for the
identification of subjects exhibiting a predisposition to such conditions. The invention provides
a method for detecting the polynucleotides of the invention in a sample, comprising contacting
the sample with a compound that binds to and forms a complex with the polynucleotide of
interest for a period sufficient to form the complex and under conditions sufficient to form a
complex and detecting the complex such that if a complex is detected, the polynucleotide of
interest is detected. The invention also provides a method for detecting the polypeptides of the
invention in a sample comprising contacting the sample with a compound that binds to and forms
a complex with the polypeptide under conditions and for a period sufficient to form the complex
and detecting the formation of the complex such that if a complex is formed, the polypeptide is
detected.

The invention also provides kits comprising polynucleotide probes and/or monoclonal
antibodies, and optionally quantitative standards, for carrying out methods of the invention.
Furthermore, the invention provides methods for evaluating the efficacy of drugs, and
monitoring the progress of patients, involved in clinical trials for the treatment of disorders as
recited above.

The invention also provides methods for the identification of compounds that modulate
(i.e., increase or decrease) the expression or activity of the polynucleotides and/or polypeptides
of the invention. Such methods can be utilized, for example, for the identification of compounds
that can ameliorate symptoms of disorders as recited herein. Such methods can include, but are
not limited to, assays for identifying compounds and other substances that interact with (e.g.,
bind to) the polypeptides of the invention. The invention provides a method for identifying a
compound that binds to the polypeptides of the invention comprising contacting the compound
with a polypeptide of the invention in a cell for a time sufficient to form a
polypeptide/compound complex, wherein the complex drives expression of a reporter gene
sequence in the cell; and detecting the complex by detecting the reporter gene sequence
expression such that if expression of the reporter gene is detected the compound that binds to a
polypeptide of the invention is identified.

The invention also provides methods for treatment which involve the
administration of the polynucleotides or polypeptides of the invention to individuals exhibiting
symptoms or tendencies. In addition, the invention encompasses methods for treating diseases or
disorders as recited herein comprising administering compounds and other substances that
modulate the overall activity of the target gene products. Compounds and other substances can
effect such modulation either on the level of target gene/protein expression or target protein
activity.

The polypeptides of the present invention and the polynucleotides encoding them are also 
useful for the same functions known to one of skill in the art as the polypeptides and 
polynucleotides to which they have homology (set forth in Table 2). If no homology is set forth 
for a sequence, then the polypeptides and polynucleotides of the present invention are useful for 
a variety of applications, as described herein, including use in arrays for detection. 

4. DETAILED DESCRIPTION OF THE INVENTION
4.1 DEFINITIONS 

It must be noted that as used herein and in the appended claims, the singular forms "a",
"an" and "the" include plural references unless the context clearly dictates otherwise.

The term "active" refers to those forms of the polypeptide which retain the biologic 
and/or immunologic activities of any naturally occurring polypeptide. According to the 
invention, the terms "biologically active" or "biological activity" refer to a protein or peptide 
having structural, regulatory or biochemical functions of a naturally occurring molecule. 
Likewise "immunologically active" or "immunological activity" refers to the capability of the 
natural, recombinant or synthetic polypeptide to induce a specific immune response in 
appropriate animals or cells and to bind with specific antibodies. 

The term "activated cells" as used in this application refers to those cells which are engaged in
extracellular or intracellular membrane trafficking, including the export of secretory or
enzymatic molecules as part of a normal or disease process.

The terms "complementary" or "complementarity" refer to the natural binding of 
polynucleotides by base pairing. For example, the sequence 5'-AGT-3' binds to the 
complementary sequence 3'-TCA-5'. Complementarity between two single-stranded molecules
may be "partial" such that only some of the nucleic acids bind or it may be "complete" such that 
total complementarity exists between the single stranded molecules. The degree of 
complementarity between the nucleic acid strands has significant effects on the efficiency and 
strength of the hybridization between the nucleic acid strands. 
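
By way of illustration only, the base-pairing rule described above can be sketched in a few lines of Python. The sketch assumes Watson-Crick pairing for DNA (A with T, G with C); the function names `complement` and `fraction_paired` are hypothetical and are not part of the disclosure.

```python
# Watson-Crick pairing for DNA; illustrative only.
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}

def complement(strand_5_to_3: str) -> str:
    """Return the complementary strand, written 3'->5' in the same left-to-right order."""
    return "".join(PAIRS[base] for base in strand_5_to_3.upper())

def fraction_paired(strand_5_to_3: str, other_3_to_5: str) -> float:
    """Fraction of aligned positions that form a Watson-Crick pair.

    A value of 1.0 corresponds to "complete" complementarity; a value
    between 0 and 1 corresponds to "partial" complementarity.
    """
    if len(strand_5_to_3) != len(other_3_to_5) or not strand_5_to_3:
        raise ValueError("strands must be non-empty and of equal length")
    paired = sum(
        PAIRS[a] == b
        for a, b in zip(strand_5_to_3.upper(), other_3_to_5.upper())
    )
    return paired / len(strand_5_to_3)

print(complement("AGT"))              # TCA
print(fraction_paired("AGT", "TCA"))  # 1.0
```

The first call reproduces the 5'-AGT-3' / 3'-TCA-5' example from the text; a strand pair in which only some positions pair would give a fraction below 1.0.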

The term "embryonic stem cells (ES)" refers to a cell that can give rise to many
differentiated cell types in an embryo or an adult, including the germ cells. The term "germ line
stem cells (GSCs)" refers to stem cells derived from primordial stem cells that provide a steady
and continuous source of germ cells for the production of gametes. The term "primordial germ
cells (PGCs)" refers to a small population of cells set aside from other cell lineages, particularly
from the yolk sac, mesenteries, or gonadal ridges, during embryogenesis that have the potential to
differentiate into germ cells and other cells. PGCs are the source from which GSCs and ES cells
are derived. The PGCs, the GSCs and the ES cells are capable of self-renewal. Thus these cells
not only populate the germ line and give rise to a plurality of terminally differentiated cells that
comprise the adult specialized organs, but are able to regenerate themselves.

The term "expression modulating fragment," EMF, means a series of nucleotides which 
modulates the expression of an operably linked ORF or another EMF. 

As used herein, a sequence is said to "modulate the expression of an operably linked
sequence" when the expression of the sequence is altered by the presence of the EMF. EMFs
include, but are not limited to, promoters, and promoter modulating sequences (inducible
elements). One class of EMFs are nucleic acid fragments which induce the expression of an
operably linked ORF in response to a specific regulatory factor or physiological event.

The terms "nucleotide sequence" or "nucleic acid" or "polynucleotide" or
"oligonucleotide" are used interchangeably and refer to a heteropolymer of nucleotides or the
sequence of these nucleotides. These phrases also refer to DNA or RNA of genomic or synthetic
origin which may be single-stranded or double-stranded and may represent the sense or the
antisense strand, to peptide nucleic acid (PNA) or to any DNA-like or RNA-like material. In the
sequences herein A is adenine, C is cytosine, T is thymine, G is guanine and N is A, C, G or T
(U). It is contemplated that where the polynucleotide is RNA, the T (thymine) in the sequences
provided herein is substituted with U (uracil). Generally, nucleic acid segments provided by this
invention may be assembled from fragments of the genome and short oligonucleotide linkers, or
from a series of oligonucleotides, or from individual nucleotides, to provide a synthetic nucleic
acid which is capable of being expressed in a recombinant transcriptional unit comprising
regulatory elements derived from a microbial or viral operon, or a eukaryotic gene.
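
By way of illustration only, the T-to-U substitution contemplated above for RNA can be sketched as follows. The function name `as_rna` is hypothetical; the sketch accepts only the symbols A, C, G, T and N named in the text and leaves N unchanged.

```python
ALLOWED_SYMBOLS = set("ACGTN")  # symbols used in the sequences, per the text

def as_rna(dna: str) -> str:
    """Return the RNA form of a DNA sequence by substituting U (uracil) for T (thymine)."""
    dna = dna.upper()
    unexpected = set(dna) - ALLOWED_SYMBOLS
    if unexpected:
        raise ValueError(f"unexpected symbols: {sorted(unexpected)}")
    return dna.replace("T", "U")

print(as_rna("ATGNCT"))  # AUGNCU
```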

The terms "oligonucleotide fragment" or a "polynucleotide fragment", "portion," or
"segment" or "probe" or "primer" are used interchangeably and refer to a sequence of nucleotide
residues which are at least about 5 nucleotides, more preferably at least about 7 nucleotides,
more preferably at least about 9 nucleotides, more preferably at least about 11 nucleotides and
most preferably at least about 17 nucleotides. The fragment is preferably less than about 500
nucleotides, preferably less than about 200 nucleotides, more preferably less than about 100
nucleotides, more preferably less than about 50 nucleotides and most preferably less than 30
nucleotides. Preferably the probe is from about 6 nucleotides to about 200 nucleotides,
preferably from about 15 to about 50 nucleotides, more preferably from about 17 to 30
nucleotides and most preferably from about 20 to 25 nucleotides. Preferably the fragments can
be used in polymerase chain reaction (PCR), various hybridization procedures or microarray
procedures to identify or amplify identical or related parts of mRNA or DNA molecules. A
fragment or segment may uniquely identify each polynucleotide sequence of the present
invention. Preferably the fragment comprises a sequence substantially similar to any one of SEQ
ID NOs: 1-1350.

Probes may, for example, be used to determine whether specific mRNA molecules are
present in a cell or tissue or to isolate similar nucleic acid sequences from chromosomal DNA as
described by Walsh et al. (Walsh, P.S. et al., 1992, PCR Methods Appl 1:241-250). They may
be labeled by nick translation, Klenow fill-in reaction, PCR, or other methods well known in the
art. Probes of the present invention, their preparation and/or labeling are elaborated in
Sambrook, J. et al., 1989, Molecular Cloning: A Laboratory Manual, Cold Spring Harbor
Laboratory, NY; or Ausubel, F.M. et al., 1989, Current Protocols in Molecular Biology, John
Wiley & Sons, New York NY, both of which are incorporated herein by reference in their
entirety.

The nucleic acid sequences of the present invention also include the sequence
information from the nucleic acid sequences of SEQ ID NO:1-1350. The sequence information
can be a segment of any one of SEQ ID NO:1-1350 that uniquely identifies or represents the
sequence information of that sequence of SEQ ID NO:1-1350. One such segment can be a
twenty-mer nucleic acid sequence because the probability that a twenty-mer is fully matched in
the human genome is approximately 1 in 300. In the human genome, there are three billion base
pairs in one set of chromosomes. Because 4^20 possible twenty-mers exist, there are approximately
300 times more twenty-mers than there are base pairs in a set of human chromosomes. Using the
same analysis, the probability for a seventeen-mer to be fully matched in the human genome is
approximately 1 in 5. When these segments are used in arrays for expression studies, fifteen-mer
segments can be used. The probability that the fifteen-mer is fully matched in the expressed
sequences is also approximately one in five because expressed sequences comprise less than
approximately 5% of the entire genome sequence.
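
By way of illustration only, the full-match estimates above can be reproduced with a short calculation. The sketch assumes a random-sequence model with equal base frequencies, so that a given k-mer is expected to occur by chance about (target size) / 4^k times; the three-billion base pair genome and the 5% expressed fraction are taken from the text, while the function name `chance_of_full_match` is hypothetical.

```python
GENOME_BP = 3_000_000_000   # base pairs in one set of human chromosomes (from the text)
EXPRESSED_FRACTION = 0.05   # upper bound on the expressed fraction (from the text)

def chance_of_full_match(k: int, target_bp: float = GENOME_BP) -> float:
    """Expected number of chance full matches of one k-mer in target_bp of random sequence."""
    return target_bp / 4 ** k

cases = (
    (20, GENOME_BP),
    (17, GENOME_BP),
    (15, GENOME_BP * EXPRESSED_FRACTION),
)
for k, target in cases:
    print(f"{k}-mer: about 1 in {1 / chance_of_full_match(k, target):.1f}")
```

Under these assumptions the ratios come out near 1 in 370 for a twenty-mer, 1 in 6 for a seventeen-mer, and 1 in 7 for a fifteen-mer against expressed sequences, which is the same order of magnitude as the rounded figures given in the text.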

Similarly, when using sequence information for detecting a single mismatch, a segment can
be a twenty-five mer. The probability that the twenty-five mer would appear in a human genome
with a single mismatch is calculated by multiplying the probability for a full match (1/4^25) by the
increased probability for mismatch at each nucleotide position (3 x 25). The probability that an
eighteen mer with a single mismatch can be detected in an array for expression studies is
approximately one in five. The probability that a twenty-mer with a single mismatch can be
detected in a human genome is approximately one in five.
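
By way of illustration only, the single-mismatch calculation can be sketched in the same way: the full-match probability 1/4^k is multiplied by the 3 x k possible single-base substitutions and scaled by the size of the target. The random-sequence model, the genome size and the 5% expressed fraction are assumptions carried over from the preceding discussion, and the function name `chance_of_single_mismatch` is hypothetical.

```python
GENOME_BP = 3_000_000_000             # base pairs in one set of human chromosomes
EXPRESSED_BP = GENOME_BP * 0.05       # expressed sequences, taken as 5% of the genome

def chance_of_single_mismatch(k: int, target_bp: float) -> float:
    """Expected chance occurrences of a k-mer carrying exactly one substitution.

    Each of the k positions can be replaced by 3 other bases, giving 3*k
    single-mismatch variants, each with full-match probability 1/4**k.
    """
    return target_bp * (3 * k) / 4 ** k

for k, target, label in (
    (18, EXPRESSED_BP, "expressed sequences"),
    (20, GENOME_BP, "genome"),
    (25, GENOME_BP, "genome"),
):
    odds = 1 / chance_of_single_mismatch(k, target)
    print(f"{k}-mer vs {label}: about 1 in {odds:.1f}")
```

Under these assumptions the eighteen-mer and twenty-mer cases come out near 1 in 8 and 1 in 6, of the same order as the rounded "one in five" figures, while a twenty-five mer gives a far smaller chance of a spurious single-mismatch occurrence.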

The term "open reading frame," ORF, means a series of nucleotide triplets coding for
amino acids without any termination codons and is a sequence translatable into protein.
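
By way of illustration only, the definition above can be sketched as a search for the longest run of triplets, in a chosen reading frame, that contains no termination codon. The sketch assumes the three termination codons of the standard genetic code (TAA, TAG, TGA); the function name `longest_open_frame` and the example sequence are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # termination codons of the standard genetic code

def longest_open_frame(seq: str, frame: int = 0) -> str:
    """Longest run of complete triplets in the given frame that contains no stop codon."""
    seq = seq.upper()
    best = ""
    current = []
    for i in range(frame, len(seq) - 2, 3):
        codon = seq[i:i + 3]
        if codon in STOP_CODONS:
            current = []                     # a stop codon ends the current run
        else:
            current.append(codon)
            if 3 * len(current) > len(best):
                best = "".join(current)
    return best

example = "ATGGCCTAAGGGCCCAAATTT"
print(longest_open_frame(example, frame=0))
print(longest_open_frame(example, frame=1))
```

Note that, following the definition in the text, the sketch does not require an initiating ATG; it only excludes termination codons.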

The terms "operably linked" or "operably associated" refer to functionally related nucleic
acid sequences. For example, a promoter is operably associated or operably linked with a coding
sequence if the promoter controls the transcription of the coding sequence. While operably
linked nucleic acid sequences can be contiguous and in the same reading frame, certain genetic
elements, e.g., repressor genes, are not contiguously linked to the coding sequence but still control
transcription/translation of the coding sequence.

The term "pluripotent" refers to the capability of a cell to differentiate into a number of
differentiated cell types that are present in an adult organism. A pluripotent cell is restricted in its
differentiation capability in comparison to a totipotent cell.

The terms "polypeptide" or "peptide" or "amino acid sequence" refer to an oligopeptide,
peptide, polypeptide or protein sequence or fragment thereof and to naturally occurring or
synthetic molecules. A polypeptide "fragment," "portion," or "segment" is a stretch of amino
acid residues of at least about 5 amino acids, preferably at least about 7 amino acids, more
preferably at least about 9 amino acids and most preferably at least about 17 or more amino
acids. The peptide preferably is not greater than about 200 amino acids, more preferably less
than 150 amino acids and most preferably less than 100 amino acids. Preferably the peptide is
from about 5 to about 200 amino acids. To be active, any polypeptide must have sufficient
length to display biological and/or immunological activity.

The term "naturally occurring polypeptide" refers to polypeptides produced by cells that
have not been genetically engineered and specifically contemplates various polypeptides arising
from post-translational modifications of the polypeptide including, but not limited to,
acetylation, carboxylation, glycosylation, phosphorylation, lipidation and acylation.

The term "translated protein coding portion" means a sequence which encodes the full
length protein which may include any leader sequence or any processing sequence.

The term "mature protein coding sequence" means a sequence which encodes a peptide
or protein without a signal or leader sequence. The "mature protein portion" means that portion
of the protein which does not include a signal or leader sequence. The peptide may have been
produced by processing in the cell which removes any leader/signal sequence. The mature
protein portion may or may not include the initial methionine residue. The methionine residue
may be removed from the protein during processing in the cell. The peptide may be produced
synthetically or the protein may have been produced using a polynucleotide encoding only
the mature protein coding sequence.

The term "derivative" refers to polypeptides chemically modified by such techniques as
ubiquitination, labeling (e.g., with radionuclides or various enzymes), covalent polymer
attachment such as pegylation (derivatization with polyethylene glycol) and insertion or
substitution by chemical synthesis of amino acids such as ornithine, which do not normally occur
in human proteins.

The term "variant" (or "analog") refers to any polypeptide differing from naturally
occurring polypeptides by amino acid insertions, deletions, and substitutions, created using, e.g.,
recombinant DNA techniques. Guidance in determining which amino acid residues may be
replaced, added or deleted without abolishing activities of interest, may be found by comparing
the sequence of the particular polypeptide with that of homologous peptides and minimizing the
number of amino acid sequence changes made in regions of high homology (conserved regions)
or by replacing amino acids with consensus sequence.

Alternatively, recombinant variants encoding these same or similar polypeptides may be
synthesized or selected by making use of the "redundancy" in the genetic code. Various codon
substitutions, such as the silent changes which produce various restriction sites, may be
introduced to optimize cloning into a plasmid or viral vector or expression in a particular
prokaryotic or eukaryotic system. Mutations in the polynucleotide sequence may be reflected in
the polypeptide or domains of other peptides added to the polypeptide to modify the properties of
any part of the polypeptide, to change characteristics such as ligand-binding affinities, interchain
affinities, or degradation/turnover rate.

Preferably, amino acid "substitutions" are the result of replacing one amino acid with
another amino acid having similar structural and/or chemical properties, i.e., conservative amino
acid replacements. "Conservative" amino acid substitutions may be made on the basis of
similarity in polarity, charge, solubility, hydrophobicity, hydrophilicity, and/or the amphipathic
nature of the residues involved. For example, nonpolar (hydrophobic) amino acids include
alanine, leucine, isoleucine, valine, proline, phenylalanine, tryptophan, and methionine; polar
neutral amino acids include glycine, serine, threonine, cysteine, tyrosine, asparagine, and
glutamine; positively charged (basic) amino acids include arginine, lysine, and histidine; and
negatively charged (acidic) amino acids include aspartic acid and glutamic acid. "Insertions" or
"deletions" are preferably in the range of about 1 to 20 amino acids, more preferably 1 to 10
amino acids. The variation allowed may be experimentally determined by systematically making
insertions, deletions, or substitutions of amino acids in a polypeptide molecule using
recombinant DNA techniques and assaying the resulting recombinant variants for activity.
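
By way of illustration only, the four amino acid groupings listed above can be encoded as a lookup table and used to test whether a substitution stays within one group. The one-letter codes are the standard ones; treating "same group" as the sole criterion for a conservative substitution is a simplifying assumption, and the names `RESIDUE_CLASSES`, `residue_class` and `is_conservative` are hypothetical.

```python
# Groupings as listed in the text, written with standard one-letter codes.
RESIDUE_CLASSES = {
    "nonpolar": set("ALIVPFWM"),      # Ala, Leu, Ile, Val, Pro, Phe, Trp, Met
    "polar_neutral": set("GSTCYNQ"),  # Gly, Ser, Thr, Cys, Tyr, Asn, Gln
    "basic": set("RKH"),              # Arg, Lys, His
    "acidic": set("DE"),              # Asp, Glu
}

def residue_class(residue: str) -> str:
    """Return the name of the group containing the given one-letter residue code."""
    for name, members in RESIDUE_CLASSES.items():
        if residue.upper() in members:
            return name
    raise ValueError(f"unknown residue: {residue!r}")

def is_conservative(original: str, replacement: str) -> bool:
    """True when both residues fall in the same group (simplified criterion)."""
    return residue_class(original) == residue_class(replacement)

print(is_conservative("L", "I"))  # True: both nonpolar
print(is_conservative("K", "E"))  # False: basic versus acidic
```

A substitution flagged as non-conservative by such a table would still need to be assayed for activity, as the text describes.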

Alternatively, where alteration of function is desired, insertions, deletions or
non-conservative alterations can be engineered to produce altered polypeptides. Such alterations
can, for example, alter one or more of the biological functions or biochemical characteristics of
the polypeptides of the invention. For example, such alterations may change polypeptide
characteristics such as ligand-binding affinities, interchain affinities, or degradation/turnover
rate. Further, such alterations can be selected so as to generate polypeptides that are better suited
for expression, scale up and the like in the host cells chosen for expression. For example,
cysteine residues can be deleted or substituted with another amino acid residue in order to
eliminate disulfide bridges.

The terms "purified" or "substantially purified" as used herein denote that the indicated
nucleic acid or polypeptide is present in the substantial absence of other biological
macromolecules, e.g., polynucleotides, proteins, and the like. In one embodiment, the
polynucleotide or polypeptide is purified such that it constitutes at least 95% by weight, more
preferably at least 99% by weight, of the indicated biological macromolecules present (but water,
buffers, and other small molecules, especially molecules having a molecular weight of less than
1000 daltons, can be present).

The term "isolated" as used herein refers to a nucleic acid or polypeptide separated from
at least one other component (e.g., nucleic acid or polypeptide) present with the nucleic acid or
polypeptide in its natural source. In one embodiment, the nucleic acid or polypeptide is found in
the presence of (if anything) only a solvent, buffer, ion, or other component normally present in a
solution of the same. The terms "isolated" and "purified" do not encompass nucleic acids or
polypeptides present in their natural source.

The term "recombinant," when used herein to refer to a polypeptide or protein, means
that a polypeptide or protein is derived from recombinant (e.g., microbial, insect, or mammalian)
expression systems. "Microbial" refers to recombinant polypeptides or proteins made in
bacterial or fungal (e.g., yeast) expression systems. As a product, "recombinant microbial"
defines a polypeptide or protein essentially free of native endogenous substances and
unaccompanied by associated native glycosylation. Polypeptides or proteins expressed in most
bacterial cultures, e.g., E. coli, will be free of glycosylation modifications; polypeptides or
proteins expressed in yeast will have a glycosylation pattern in general different from those
expressed in mammalian cells.

The term "recombinant expression vehicle or vector" refers to a plasmid or phage or virus 
or vector, for expressing a polypeptide from a DNA (RNA) sequence. An expression vehicle can 
comprise a transcriptional unit comprising an assembly of (1) a genetic element or elements 
having a regulatory role in gene expression, for example, promoters or enhancers, (2) a structural 
or coding sequence which is transcribed into mRNA and translated into protein, and (3) 
appropriate transcription initiation and termination sequences. Structural units intended for use 
in yeast or eukaryotic expression systems preferably include a leader sequence enabling 
extracellular secretion of translated protein by a host cell. Alternatively, where recombinant 
protein is expressed without a leader or transport sequence, it may include an amino terminal 
methionine residue. This residue may or may not be subsequently cleaved from the expressed 
recombinant protein to provide a final product. 

The term "recombinant expression system" means host cells which have stably integrated 
a recombinant transcriptional unit into chromosomal DNA or carry the recombinant 
transcriptional unit extrachromosomally. Recombinant expression systems as defined herein will 
express heterologous polypeptides or proteins upon induction of the regulatory elements linked 
to the DNA segment or synthetic gene to be expressed. This term also means host cells which 
have stably integrated a recombinant genetic element or elements having a regulatory role in 
gene expression, for example, promoters or enhancers. Recombinant expression systems as 
defined herein will express polypeptides or proteins endogenous to the cell upon induction of the 
regulatory elements linked to the endogenous DNA segment or gene to be expressed. The cells 
can be prokaryotic or eukaryotic. 

The term "secreted" includes a protein that is transported across or through a membrane, 
including transport as a result of signal sequences in its amino acid sequence when it is 
expressed in a suitable host cell. "Secreted" proteins include without limitation proteins secreted 
wholly (e.g., soluble proteins) or partially (e.g., receptors) from the cell in which they are 
expressed. "Secreted" proteins also include without limitation proteins that are transported 
across the membrane of the endoplasmic reticulum. "Secreted" proteins are also intended to 
include proteins containing non-typical signal sequences (e.g., Interleukin-1 Beta, see Krasney, 
P.A. and Young, P.R. (1992) Cytokine 4(2):134-143) and factors released from damaged cells 
(e.g., Interleukin-1 Receptor Antagonist, see Arend, W.P. et al. (1998) Annu. Rev. Immunol. 
16:27-55). 

Where desired, an expression vector may be designed to contain a "signal or leader 
sequence" which will direct the polypeptide through the membrane of a cell. Such a sequence 
may be naturally present on the polypeptides of the present invention or provided from 
heterologous protein sources by recombinant DNA techniques. 

The term "stringent" is used to refer to conditions that are commonly understood in the 
art as stringent. Stringent conditions can include highly stringent conditions (i.e., hybridization 
to filter-bound DNA in 0.5 M NaHPO4, 7% sodium dodecyl sulfate (SDS), 1 mM EDTA at 
65°C, and washing in 0.1X SSC/0.1% SDS at 68°C), and moderately stringent conditions (i.e., 
washing in 0.2X SSC/0.1% SDS at 42°C). Other exemplary hybridization conditions are 
described herein in the examples. 

In instances of hybridization of deoxyoligonucleotides, additional exemplary stringent 
hybridization conditions include washing in 6X SSC/0.05% sodium pyrophosphate at 37°C (for 
14-base oligonucleotides), 48°C (for 17-base oligonucleotides), 55°C (for 20-base oligonucleotides), and 
60°C (for 23-base oligonucleotides). 
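
By way of illustration only, the exemplary oligonucleotide wash temperatures recited above can be held in a simple lookup table. The following Python sketch is not part of the disclosed methods; the names OLIGO_WASH_TEMP_C and wash_temperature_c are hypothetical, and no temperature is inferred for lengths other than the four recited.

```python
# Exemplary wash temperatures (deg C) in 6X SSC/0.05% sodium pyrophosphate,
# keyed by oligonucleotide length in bases, as recited in the text.
OLIGO_WASH_TEMP_C = {14: 37, 17: 48, 20: 55, 23: 60}


def wash_temperature_c(oligo_length):
    """Return the recited wash temperature, or None if the length is not recited.

    No interpolation is attempted: the text gives only four discrete lengths.
    """
    return OLIGO_WASH_TEMP_C.get(oligo_length)


if __name__ == "__main__":
    for length in (14, 17, 20, 23):
        print(length, "bases:", wash_temperature_c(length), "deg C")
```

Returning None for an unlisted length, rather than interpolating, keeps the sketch limited to what is actually stated.
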

As used herein, "substantially equivalent" can refer both to nucleotide and amino acid 
sequences, for example a mutant sequence, that varies from a reference sequence by one or more 
substitutions, deletions, or additions, the net effect of which does not result in an adverse 
functional dissimilarity between the reference and subject sequences. Typically, such a 
substantially equivalent sequence varies from one of those listed herein by no more than about 
35% (i.e., the number of individual residue substitutions, additions, and/or deletions in a 
substantially equivalent sequence, as compared to the corresponding reference sequence, divided 
by the total number of residues in the substantially equivalent sequence is about 0.35 or less). 
Such a sequence is said to have 65% sequence identity to the listed sequence. In one 
embodiment, a substantially equivalent, e.g., mutant, sequence of the invention varies from a 
listed sequence by no more than 30% (70% sequence identity); in a variation of this 
embodiment, by no more than 25% (75% sequence identity); in a further variation of this 
embodiment, by no more than 20% (80% sequence identity); in a further variation of this 
embodiment, by no more than 10% (90% sequence identity); and in a further variation of this 
embodiment, by no more than 5% (95% sequence identity). Substantially equivalent, e.g., 
mutant, amino acid sequences according to the invention preferably have at least 80% sequence 
identity with a listed amino acid sequence, more preferably at least 85% sequence identity, more 
preferably at least 90% sequence identity, more preferably at least 95% identity, more preferably 
at least 98% identity, and most preferably at least 99% identity. Substantially equivalent 
nucleotide sequences of the invention can have lower percent sequence identities, taking into 
account, for example, the redundancy or degeneracy of the genetic code. Preferably, a nucleotide 
sequence has at least about 65% identity, more preferably at least about 75% identity, more 
preferably at least about 80% sequence identity, more preferably at least about 85% sequence 
identity, more preferably at least about 90% sequence identity, more preferably at least about 
95% identity, more preferably at least about 98% sequence identity, and most preferably at least 
about 99% sequence identity. For the purposes of the present invention, sequences having 
substantially equivalent biological activity and substantially equivalent expression characteristics 
are considered substantially equivalent. For the purposes of determining equivalence, truncation 
of the mature sequence (e.g., via a mutation which creates a spurious stop codon) should be 
disregarded. Sequence identity may be determined, e.g., using the Jotun Hein method (Hein, J. 
(1990) Methods Enzymol. 183:626-645). Identity between sequences can also be determined by 
other methods known in the art, e.g., by varying hybridization conditions. 
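
The numerical part of the "substantially equivalent" definition (differences divided by the number of residues in the substantially equivalent sequence, about 0.35 or less) can be illustrated with a short Python sketch. This is a simplified illustration under stated assumptions, not the Jotun Hein method: it assumes the two sequences have already been aligned to equal length with "-" marking gaps, and the function names are hypothetical. It does not address the functional criterion of the definition.

```python
def fraction_different(reference, candidate, gap="-"):
    """Differences per residue of the candidate sequence.

    Assumes `reference` and `candidate` are pre-aligned, equal-length strings
    in which `gap` marks an alignment gap. Each column in which the two differ
    counts as one substitution, addition, or deletion.
    """
    if len(reference) != len(candidate):
        raise ValueError("sequences must be aligned to the same length")
    differences = sum(1 for r, c in zip(reference, candidate) if r != c)
    residues = sum(1 for c in candidate if c != gap)
    if residues == 0:
        raise ValueError("candidate contains no residues")
    return differences / residues


def within_difference_limit(reference, candidate, max_fraction=0.35):
    """True if the candidate differs from the reference by no more than max_fraction."""
    return fraction_different(reference, candidate) <= max_fraction


if __name__ == "__main__":
    ref = "ACGTACGTAC"
    cand = "ACGTTCGT-C"  # one substitution and one deletion (hypothetical example)
    print(round(fraction_different(ref, cand), 3))
    print(within_difference_limit(ref, cand))
```

The narrower embodiments (30%, 25%, 20%, 10%, 5%) correspond to passing a smaller max_fraction.
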

The term "totipotent" refers to the capability of a cell to differentiate into all of the cell 
types of an adult organism. 

The term "transformation" means introducing DNA into a suitable host cell so that the 
DNA is replicable, either as an extrachromosomal element, or by chromosomal integration. The 
term "transfection" refers to the taking up of an expression vector by a suitable host cell, whether 
or not any coding sequences are in fact expressed. The term "infection" refers to the introduction 
of nucleic acids into a suitable host cell by use of a virus or viral vector. 

As used herein, an "uptake modulating fragment," UMF, means a series of nucleotides 
which mediate the uptake of a linked DNA fragment into a cell. UMFs can be readily identified 
using known UMFs as a target sequence or target motif with the computer-based systems 
described below. The presence and activity of a UMF can be confirmed by attaching the 
suspected UMF to a marker sequence. The resulting nucleic acid molecule is then incubated 
with an appropriate host under appropriate conditions and the uptake of the marker sequence is 
determined. As described above, a UMF will increase the frequency of uptake of a linked 
marker sequence. 

Each of the above terms is meant to encompass all that is described for each, unless the 
context dictates otherwise. 

4.2 NUCLEIC ACIDS OF THE INVENTION 

Nucleotide sequences of the invention are set forth in the Sequence Listing. 
The isolated polynucleotides of the invention include a polynucleotide comprising the 
nucleotide sequences of SEQ ID NO: 1-1350; a polynucleotide encoding any one of the peptide 
sequences of SEQ ID NO: 1351-2700; and a polynucleotide comprising the nucleotide sequence 
encoding the mature protein coding sequence of the polypeptides of any one of SEQ ID 
NO: 1351-2700. The polynucleotides of the present invention also include, but are not limited to, 
a polynucleotide that hybridizes under stringent conditions to (a) the complement of any of the 
nucleotide sequences of SEQ ID NO: 1-1350; (b) nucleotide sequences encoding any one of the 
amino acid sequences set forth in the Sequence Listing; (c) a polynucleotide which is an allelic 
variant of any polynucleotide recited above; (d) a polynucleotide which encodes a species 
homolog of any of the proteins recited above; or (e) a polynucleotide that encodes a polypeptide 
comprising a specific domain or truncation of the polypeptides of SEQ ID NO: 1351-2700. 
Domains of interest may depend on the nature of the encoded polypeptide; e.g., domains in 
receptor-like polypeptides include ligand-binding, extracellular, transmembrane, or cytoplasmic 
domains, or combinations thereof; domains in immunoglobulin-like proteins include the variable 
immunoglobulin-like domains; domains in enzyme-like polypeptides include catalytic and 
substrate binding domains; and domains in ligand polypeptides include receptor-binding 
domains. 

The polynucleotides of the invention include naturally occurring or wholly or partially 
synthetic DNA, e.g., cDNA and genomic DNA, and RNA, e.g., mRNA. The polynucleotides 
may include all of the coding region of the cDNA or may represent a portion of the coding 
region of the cDNA. 

The present invention also provides genes corresponding to the cDNA sequences disclosed 
herein. The corresponding genes can be isolated in accordance with known methods using the 
sequence information disclosed herein. Such methods include the preparation of probes or primers 
from the disclosed sequence information for identification and/or amplification of genes in 
appropriate genomic libraries or other sources of genomic materials. Further 5' and 3' sequence can 
be obtained using methods known in the art. For example, full length cDNA or genomic DNA that 
corresponds to any of the polynucleotides of SEQ ID NO: 1-1350 can be obtained by screening 
appropriate cDNA or genomic DNA libraries under suitable hybridization conditions using any of 
the polynucleotides of SEQ ID NO: 1-1350 or a portion thereof as a probe. Alternatively, the 
polynucleotides of SEQ ID NO: 1-1350 may be used as the basis for suitable primer(s) that allow 
identification and/or amplification of genes in appropriate genomic DNA or cDNA libraries. 

The nucleic acid sequences of the invention can be assembled from ESTs and sequences 
(including cDNA and genomic sequences) obtained from one or more public databases, such as 
dbEST, gbpri, and UniGene. The EST sequences can provide identifying sequence information, 
representative fragment or segment information, or novel segment information for the full-length 
gene. 

The polynucleotides of the invention also provide polynucleotides including nucleotide 
sequences that are substantially equivalent to the polynucleotides recited above. Polynucleotides 
according to the invention can have, e.g., at least about 65%, at least about 70%, at least about 
75%, at least about 80%, 81%, 82%, 83%, 84%, more typically at least about 85%, 86%, 87%, 
88%, 89%, more typically at least about 90%, 91%, 92%, 93%, 94%, and even more typically at 
least about 95%, 96%, 97%, 98%, 99%, sequence identity to a polynucleotide recited above. 

Included within the scope of the nucleic acid sequences of the invention are nucleic acid 
sequence fragments that hybridize under stringent conditions to any of the nucleotide sequences 
of SEQ ID NO: 1-1350, or complements thereof, which fragment is greater than about 5 
nucleotides, preferably 7 nucleotides, more preferably greater than 9 nucleotides and most 
preferably greater than 17 nucleotides. Fragments of, e.g., 15, 17, or 20 nucleotides or more that 
are selective for (i.e., specifically hybridize to) any one of the polynucleotides of the invention 
are contemplated. Probes capable of specifically hybridizing to a polynucleotide can 
differentiate polynucleotide sequences of the invention from other polynucleotide sequences in 
the same family of genes or can differentiate human genes from genes of other species, and are 
preferably based on unique nucleotide sequences. 

The sequences falling within the scope of the present invention are not limited to these 
specific sequences, but also include allelic and species variations thereof. Allelic and species 
variations can be routinely determined by comparing the sequence provided in SEQ ID NO: 1-1350, a 
representative fragment thereof, or a nucleotide sequence at least 90% identical, preferably 95% 
identical, to SEQ ID NO: 1-1350 with a sequence from another isolate of the same species. 
Furthermore, to accommodate codon variability, the invention includes nucleic acid molecules 
coding for the same amino acid sequences as do the specific ORFs disclosed herein. In other words, 
in the coding region of an ORF, substitution of one codon for another codon that encodes the same 
amino acid is expressly contemplated. 
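
As a non-limiting illustration of codon substitution within an ORF, the following Python sketch checks whether two nucleotide sequences encode the same amino acid sequence. It assumes the standard genetic code and in-frame DNA sequences; the example ORFs and the function names are hypothetical and are not taken from the Sequence Listing.

```python
from itertools import product

# Standard genetic code, with codons ordered by base T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(orf):
    """Translate an in-frame DNA ORF; '*' marks a stop codon."""
    orf = orf.upper()
    if len(orf) % 3 != 0:
        raise ValueError("ORF length must be a multiple of three")
    return "".join(CODON_TABLE[orf[i:i + 3]] for i in range(0, len(orf), 3))


def encode_same_protein(orf_a, orf_b):
    """True if both ORFs translate to the same amino acid sequence."""
    return translate(orf_a) == translate(orf_b)


if __name__ == "__main__":
    original = "ATGCTGAGCTAA"  # hypothetical ORF: Met-Leu-Ser-stop
    variant = "ATGTTATCGTAA"   # synonymous codons substituted for Leu and Ser
    print(translate(original), translate(variant))
    print(encode_same_protein(original, variant))
```

Here the two sequences differ at five of twelve nucleotide positions yet encode the same peptide, which is the situation the preceding paragraph contemplates.
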

The nearest neighbor or homology result for the nucleic acids of the present invention, 
including SEQ ID NO: 1-1350, can be obtained by searching a database using an algorithm or a 
program. Preferably, BLAST, which stands for Basic Local Alignment Search Tool, is used to 
search for local sequence alignments (Altschul, S.F. J. Mol. Evol. 36:290-300 (1993) and Altschul, 
S.F. et al. J. Mol. Biol. 215:403-410 (1990)). Alternatively, a FASTA version 3 search against 
Genpept, using the Fastxy algorithm, may be used. 

Species homologs (or orthologs) of the disclosed polynucleotides and proteins are also 
provided by the present invention. Species homologs may be isolated and identified by making 
suitable probes or primers from the sequences provided herein and screening a suitable nucleic 
acid source from the desired species. 

The invention also encompasses allelic variants of the disclosed polynucleotides or 
proteins; that is, naturally-occurring alternative forms of the isolated polynucleotide which also 
encode proteins which are identical, homologous or related to that encoded by the 
polynucleotides. 

The nucleic acid sequences of the invention are further directed to sequences which 
encode variants of the described nucleic acids. These amino acid sequence variants may be 
prepared by methods known in the art by introducing appropriate nucleotide changes into a 
native or variant polynucleotide. There are two variables in the construction of amino acid 
sequence variants: the location of the mutation and the nature of the mutation. Nucleic acids 
encoding the amino acid sequence variants are preferably constructed by mutating the 
polynucleotide to encode an amino acid sequence that does not occur in nature. These nucleic 
acid alterations can be made at sites that differ in the nucleic acids from different species 
(variable positions) or in highly conserved regions (constant regions). Sites at such locations 
will typically be modified in series, e.g., by substituting first with conservative choices (e.g., 
hydrophobic amino acid to a different hydrophobic amino acid) and then with more distant 
choices (e.g., hydrophobic amino acid to a charged amino acid), and then deletions or insertions 
may be made at the target site. Amino acid sequence deletions generally range from about 1 to 
30 residues, preferably about 1 to 10 residues, and are typically contiguous. Amino acid 
insertions include amino- and/or carboxyl-terminal fusions ranging in length from one to one 
hundred or more residues, as well as intrasequence insertions of single or multiple amino acid 
residues. Intrasequence insertions may range generally from about 1 to 10 amino residues, 
preferably from 1 to 5 residues. Examples of terminal insertions include the heterologous signal 
sequences necessary for secretion or for intracellular targeting in different host cells and 
sequences such as FLAG or poly-histidine sequences useful for purifying the expressed protein. 

In a preferred method, polynucleotides encoding the novel amino acid sequences are 
changed via site-directed mutagenesis. This method uses oligonucleotide sequences to alter a 
polynucleotide to encode the desired amino acid variant, as well as sufficient adjacent 
nucleotides on both sides of the changed amino acid to form a stable duplex on either side of the 
site being changed. In general, the techniques of site-directed mutagenesis are well known to 
those of skill in the art and this technique is exemplified by publications such as Edelman et al., 
DNA 2:183 (1983). A versatile and efficient method for producing site-specific changes in a 
polynucleotide sequence was published by Zoller and Smith, Nucleic Acids Res. 10:6487-6500 
(1982). PCR may also be used to create amino acid sequence variants of the novel nucleic acids. 
When small amounts of template DNA are used as starting material, primer(s) that differs 
slightly in sequence from the corresponding region in the template DNA can generate the desired 
amino acid variant. PCR amplification results in a population of product DNA fragments that 
differ from the polynucleotide template encoding the polypeptide at the position specified by the 
primer. The product DNA fragments replace the corresponding region in the plasmid and this 
gives a polynucleotide encoding the desired amino acid variant. 
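
The composition of a mutagenic oligonucleotide as described above (the altered codon together with adjacent template nucleotides on both sides) can be sketched in Python as follows. This is an illustrative sketch only: the function name, the example template, and the flank length are hypothetical, and what constitutes "sufficient" flanking sequence for a stable duplex is left to the practitioner.

```python
def mutagenic_oligo(template, codon_index, new_codon, flank=12):
    """Build an oligonucleotide that replaces one codon of an in-frame template.

    `template` is the coding strand starting in frame at position 0 and
    `codon_index` is zero-based. `flank` is the number of unchanged template
    nucleotides kept on each side of the altered codon (an assumed value).
    """
    template = template.upper()
    new_codon = new_codon.upper()
    if len(new_codon) != 3:
        raise ValueError("new_codon must be three nucleotides")
    start = 3 * codon_index
    end = start + 3
    if start - flank < 0 or end + flank > len(template):
        raise ValueError("not enough flanking template sequence")
    return template[start - flank:start] + new_codon + template[end:end + flank]


if __name__ == "__main__":
    # Hypothetical template: ATG GCT AAA GGT TGC GAT CTG CCG GAA ACC
    template = "ATGGCTAAAGGTTGCGATCTGCCGGAAACC"
    # Replace the cysteine codon (TGC, zero-based codon 4) with a serine codon (AGC).
    print(mutagenic_oligo(template, 4, "AGC", flank=9))
```

The example substitutes a cysteine codon, in keeping with the earlier observation that cysteine residues can be substituted to eliminate disulfide bridges.
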

A further technique for generating amino acid variants is the cassette mutagenesis 
technique described in Wells et al., Gene 34:315 (1985); and other mutagenesis techniques well 
known in the art, such as, for example, the techniques in Sambrook et al., supra, and Current 
Protocols in Molecular Biology, Ausubel et al. Due to the inherent degeneracy of the genetic 
code, other DNA sequences which encode substantially the same or a functionally equivalent 
amino acid sequence may be used in the practice of the invention for the cloning and expression 
of these novel nucleic acids. Such DNA sequences include those which are capable of 
hybridizing to the appropriate novel nucleic acid sequence under stringent conditions. 

Polynucleotides encoding preferred polypeptide truncations of the invention can be used 
to generate polynucleotides encoding chimeric or fusion proteins comprising one or more 
domains of the invention and heterologous protein sequences. 

The polynucleotides of the invention additionally include the complement of any of the 
polynucleotides recited above. The polynucleotide can be DNA (genomic, cDNA, amplified, or 
synthetic) or RNA. Methods and algorithms for obtaining such polynucleotides are well known 
to those of skill in the art and can include, for example, methods for determining hybridization 
conditions that can routinely isolate polynucleotides of the desired sequence identities. 

In accordance with the invention, polynucleotide sequences comprising the mature 
protein coding sequences corresponding to any one of SEQ ID NO: 1-1350, or functional 
equivalents thereof, may be used to generate recombinant DNA molecules that direct the 
expression of that nucleic acid, or a functional equivalent thereof, in appropriate host cells. Also 
included are the cDNA inserts of any of the clones identified herein. 

A polynucleotide according to the invention can be joined to any of a variety of other 
nucleotide sequences by well-established recombinant DNA techniques (see Sambrook J et al. 
(1989) Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory, NY). Useful 
nucleotide sequences for joining to polynucleotides include an assortment of vectors, e.g., 
plasmids, cosmids, lambda phage derivatives, phagemids, and the like, that are well known in the 
art. Accordingly, the invention also provides a vector including a polynucleotide of the 
invention and a host cell containing the polynucleotide. In general, the vector contains an origin 
of replication functional in at least one organism, convenient restriction endonuclease sites, and a 
selectable marker for the host cell. Vectors according to the invention include expression 
vectors, replication vectors, probe generation vectors, and sequencing vectors. A host cell 
according to the invention can be a prokaryotic or eukaryotic cell and can be a unicellular 
organism or part of a multicellular organism. 

The present invention further provides recombinant constructs comprising a nucleic acid 
having any of the nucleotide sequences of SEQ ID NO: 1-1350 or a fragment thereof or any other 
polynucleotides of the invention. In one embodiment, the recombinant constructs of the present 
invention comprise a vector, such as a plasmid or viral vector, into which a nucleic acid having 
any of the nucleotide sequences of SEQ ID NO: 1-1350 or a fragment thereof is inserted, in a 
forward or reverse orientation. In the case of a vector comprising one of the ORFs of the present 
invention, the vector may further comprise regulatory sequences, including for example, a 
promoter, operably linked to the ORF. Large numbers of suitable vectors and promoters are 
known to those of skill in the art and are commercially available for generating the recombinant 
constructs of the present invention. The following vectors are provided by way of example. 
Bacterial: pBs, phagescript, PsiX174, pBluescript SK, pBs KS, pNH8a, pNH16a, pNH18a, 
pNH46a (Stratagene); pTrc99A, pKK223-3, pKK233-3, pDR540, pRIT5 (Pharmacia). 
Eukaryotic: pWLneo, pSV2cat, pOG44, PXTI, pSG (Stratagene); pSVK3, pBPV, pMSG, pSVL 
(Pharmacia). 

The isolated polynucleotide of the invention may be operably linked to an expression 
control sequence such as the pMT2 or pED expression vectors disclosed in Kaufman et al., 
Nucleic Acids Res. 19, 4485-4490 (1991), in order to produce the protein recombinantly. Many 
suitable expression control sequences are known in the art. General methods of expressing 
recombinant proteins are also known and are exemplified in R. Kaufman, Methods in 
Enzymology 185, 537-566 (1990). As defined herein "operably linked" means that the isolated 
polynucleotide of the invention and an expression control sequence are situated within a vector 
or cell in such a way that the protein is expressed by a host cell which has been transformed 
(transfected) with the ligated polynucleotide/expression control sequence. 

Promoter regions can be selected from any desired gene using CAT (chloramphenicol 
acetyltransferase) vectors or other vectors with selectable markers. Two appropriate vectors are 
pKK232-8 and pCM7. Particular named bacterial promoters include lacI, lacZ, T3, T7, gpt, 
lambda PR, and trc. Eukaryotic promoters include CMV immediate early, HSV thymidine 
kinase, early and late SV40, LTRs from retrovirus, and mouse metallothionein-I. Selection of 
the appropriate vector and promoter is well within the level of ordinary skill in the art. 
Generally, recombinant expression vectors will include origins of replication and selectable 
markers permitting transformation of the host cell, e.g., the ampicillin resistance gene of E. coli 
and S. cerevisiae TRP1 gene, and a promoter derived from a highly-expressed gene to direct 
transcription of a downstream structural sequence. Such promoters can be derived from operons 
encoding glycolytic enzymes such as 3-phosphoglycerate kinase (PGK), α-factor, acid 
phosphatase, or heat shock proteins, among others. The heterologous structural sequence is 
assembled in appropriate phase with translation initiation and termination sequences, and 
preferably, a leader sequence capable of directing secretion of translated protein into the 
periplasmic space or extracellular medium. Optionally, the heterologous sequence can encode a 
fusion protein including an amino terminal identification peptide imparting desired 
characteristics, e.g., stabilization or simplified purification of expressed recombinant product. 

Useful expression vectors for bacterial use are constructed by inserting a structural DNA 
sequence encoding a desired protein together with suitable translation initiation and termination 
signals in operable reading phase with a functional promoter. The vector will comprise one or 
more phenotypic selectable markers and an origin of replication to ensure maintenance of the 
vector and to, if desirable, provide amplification within the host. Suitable prokaryotic hosts for 
transformation include E. coli, Bacillus subtilis, Salmonella typhimurium and various species 
within the genera Pseudomonas, Streptomyces, and Staphylococcus, although others may also be 
employed as a matter of choice. 

As a representative but non-limiting example, useful expression vectors for bacterial use 
can comprise a selectable marker and bacterial origin of replication derived from commercially 
available plasmids comprising genetic elements of the well known cloning vector pBR322 
(ATCC 37017). Such commercial vectors include, for example, pKK223-3 (Pharmacia Fine 
Chemicals, Uppsala, Sweden) and GEM1 (Promega Biotech, Madison, WI, USA). These 
pBR322 "backbone" sections are combined with an appropriate promoter and the structural 
sequence to be expressed. Following transformation of a suitable host strain and growth of the 
host strain to an appropriate cell density, the selected promoter is induced or derepressed by 
appropriate means (e.g., temperature shift or chemical induction) and cells are cultured for an 
additional period. Cells are typically harvested by centrifugation, disrupted by physical or 
chemical means, and the resulting crude extract retained for further purification. 

Polynucleotides of the invention can also be used to induce immune responses. For 
example, as described in Fan et al., Nat. Biotech. 17:870-872 (1999), incorporated herein by 
reference, nucleic acid sequences encoding a polypeptide may be used to generate antibodies 
against the encoded polypeptide following topical administration of naked plasmid DNA or 
following injection, and preferably intramuscular injection of the DNA. The nucleic acid 
sequences are preferably inserted in a recombinant expression vector and may be in the form of 
naked DNA. 



4.3 ANTISENSE 

Another aspect of the invention pertains to isolated antisense nucleic acid molecules that 
are hybridizable to or complementary to the nucleic acid molecule comprising the nucleotide 
sequence of SEQ ID NO: 1-1350, or fragments, analogs or derivatives thereof. An "antisense" 
nucleic acid comprises a nucleotide sequence that is complementary to a "sense" nucleic acid 
encoding a protein, e.g., complementary to the coding strand of a double-stranded cDNA 
molecule or complementary to an mRNA sequence. In specific aspects, antisense nucleic acid 
molecules are provided that comprise a sequence complementary to at least about 10, 25, 50, 
100, 250 or 500 nucleotides or an entire coding strand, or to only a portion thereof. Nucleic acid 
molecules encoding fragments, homologs, derivatives and analogs of a protein of any of SEQ ID 
NO: 1351-2700 or antisense nucleic acids complementary to a nucleic acid sequence of SEQ ID 
NO: 1-1350 are additionally provided. 

In one embodiment, an antisense nucleic acid molecule is antisense to a "coding region" 
of the coding strand of a nucleotide sequence of the invention. The term "coding region" refers 
to the region of the nucleotide sequence comprising codons which are translated into amino acid 
residues. In another embodiment, the antisense nucleic acid molecule is antisense to a 
"noncoding region" of the coding strand of a nucleotide sequence of the invention. The term 
"noncoding region" refers to 5' and 3' sequences which flank the coding region that are not 
translated into amino acids (i.e., also referred to as 5' and 3' untranslated regions). 

Given the coding strand sequences encoding a nucleic acid disclosed herein (e.g., SEQ 
ID NO: 1-1350), antisense nucleic acids of the invention can be designed according to the rules 
of Watson and Crick or Hoogsteen base pairing. The antisense nucleic acid molecule can be 
complementary to the entire coding region of a mRNA, but more preferably is an oligonucleotide 
that is antisense to only a portion of the coding or noncoding region of a mRNA. For example, 
the antisense oligonucleotide can be complementary to the region surrounding the translation 
start site of a mRNA. An antisense oligonucleotide can be, for example, about 5, 10, 15, 20, 25, 
30, 35, 40, 45 or 50 nucleotides in length. An antisense nucleic acid of the invention can be 
constructed using chemical synthesis or enzymatic ligation reactions using procedures known in 
the art. For example, an antisense nucleic acid (e.g., an antisense oligonucleotide) can be 
chemically synthesized using naturally occurring nucleotides or variously modified nucleotides 
designed to increase the biological stability of the molecules or to increase the physical stability 
of the duplex formed between the antisense and sense nucleic acids, e.g., phosphorothioate 
derivatives and acridine substituted nucleotides can be used. 
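
Design of an antisense oligonucleotide by Watson-Crick base pairing against the region surrounding a translation start site can be sketched in Python as shown below. This is an illustration only: the example mRNA, the window position, and the function name are hypothetical, the output is assumed to be an unmodified DNA oligonucleotide written 5' to 3', and Hoogsteen pairing and modified nucleotides are not modeled.

```python
# Watson-Crick complement of each mRNA base, expressed as a DNA base.
RNA_TO_ANTISENSE_DNA = str.maketrans("ACGU", "TGCA")


def antisense_oligo(mrna, start, length=20):
    """Return a DNA oligonucleotide (5'->3') antisense to mrna[start:start+length]."""
    target = mrna.upper()[start:start + length]
    if start < 0 or len(target) < length:
        raise ValueError("target window falls outside the mRNA")
    # Complement each base, then reverse so the result reads 5' to 3'.
    return target.translate(RNA_TO_ANTISENSE_DNA)[::-1]


if __name__ == "__main__":
    mrna = "GGCACCAUGGCUAAAGGUUGC"  # hypothetical mRNA; AUG start codon at index 6
    aug = mrna.find("AUG")
    # A 15-nucleotide window beginning 3 nucleotides upstream of the start codon.
    print(antisense_oligo(mrna, aug - 3, length=15))
```

The length argument can be set to any of the exemplary oligonucleotide lengths recited above.
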

Examples of modified nucleotides that can be used to generate the antisense nucleic acid 
include: 5-fluorouracil, 5-bromouracil, 5-chlorouracil, 5-iodouracil, hypoxanthine, xanthine, 
4-acetylcytosine, 5-(carboxyhydroxylmethyl) uracil, 5-carboxymethylaminomethyl-2-thiouridine, 
5-carboxymethylaminomethyluracil, dihydrouracil, beta-D-galactosylqueosine, 
inosine, N6-isopentenyladenine, 1-methylguanine, 1-methylinosine, 2,2-dimethylguanine, 
2-methyladenine, 2-methylguanine, 3-methylcytosine, 5-methylcytosine, N6-adenine, 
7-methylguanine, 5-methylaminomethyluracil, 5-methoxyaminomethyl-2-thiouracil, 
beta-D-mannosylqueosine, 5'-methoxycarboxymethyluracil, 5-methoxyuracil, 
2-methylthio-N6-isopentenyladenine, uracil-5-oxyacetic acid (v), wybutoxosine, pseudouracil, 
queosine, 2-thiocytosine, 5-methyl-2-thiouracil, 2-thiouracil, 4-thiouracil, 5-methyluracil, 
uracil-5-oxyacetic acid methylester, uracil-5-oxyacetic acid (v), 5-methyl-2-thiouracil, 
3-(3-amino-3-N-2-carboxypropyl) uracil, (acp3)w, and 2,6-diaminopurine. Alternatively, the 
antisense nucleic acid can be produced biologically using an expression vector into which a 
nucleic acid has been subcloned in an antisense orientation (i.e., RNA transcribed from the 
inserted nucleic acid will be of an antisense orientation to a target nucleic acid of interest, 
described further in the following subsection). 

The antisense nucleic acid molecules of the invention are typically administered to a 
subject or generated in situ such that they hybridize with or bind to cellular mRNA and/or 
genomic DNA encoding a protein according to the invention to thereby inhibit expression of the 
protein, e.g., by inhibiting transcription and/or translation. The hybridization can be by 
conventional nucleotide complementarity to form a stable duplex, or, for example, in the case of 
an antisense nucleic acid molecule that binds to DNA duplexes, through specific interactions in 
the major groove of the double helix. An example of a route of administration of antisense 
nucleic acid molecules of the invention includes direct injection at a tissue site. Alternatively, 
antisense nucleic acid molecules can be modified to target selected cells and then administered 
systemically. For example, for systemic administration, antisense molecules can be modified 
such that they specifically bind to receptors or antigens expressed on a selected cell surface, e.g., 
by linking the antisense nucleic acid molecules to peptides or antibodies that bind to cell surface 
receptors or antigens. The antisense nucleic acid molecules can also be delivered to cells using 
the vectors described herein. To achieve sufficient intracellular concentrations of antisense 
molecules, vector constructs in which the antisense nucleic acid molecule is placed under the 
control of a strong pol II or pol III promoter are preferred. 

In yet another embodiment, the antisense nucleic acid molecule of the invention is an 
α-anomeric nucleic acid molecule. An α-anomeric nucleic acid molecule forms specific 
double-stranded hybrids with complementary RNA in which, contrary to the usual β-units, the 
strands run parallel to each other (Gaultier et al. (1987) Nucleic Acids Res 15: 6625-6641). The 
antisense nucleic acid molecule can also comprise a 2'-O-methylribonucleotide (Inoue et al. 
(1987) Nucleic Acids Res 15: 6131-6148) or a chimeric RNA-DNA analogue (Inoue et al. (1987) 
FEBS Lett 215: 327-330). 

4.4 RIBOZYMES AND PNA MOIETIES 

In still another embodiment, an antisense nucleic acid of the invention is a ribozyme. 
Ribozymes are catalytic RNA molecules with ribonuclease activity that are capable of cleaving a 
single-stranded nucleic acid, such as a mRNA, to which they have a complementary region. 
Thus, ribozymes (e.g., hammerhead ribozymes (described in Haselhoff and Gerlach (1988) 
Nature 334:585-591)) can be used to catalytically cleave mRNA transcripts to thereby inhibit 
translation of a mRNA. A ribozyme having specificity for a nucleic acid of the invention can be 
designed based upon the nucleotide sequence of a DNA disclosed herein (i.e., SEQ ID NO: 1-1350). 
For example, a derivative of a Tetrahymena L-19 IVS RNA can be constructed in which 
the nucleotide sequence of the active site is complementary to the nucleotide sequence to be 
cleaved in a SECX-encoding mRNA. See, e.g., Cech et al. U.S. Pat. No. 4,987,071; and Cech et 
al. U.S. Pat. No. 5,116,742. Alternatively, SECX mRNA can be used to select a catalytic RNA 
having a specific ribonuclease activity from a pool of RNA molecules. See, e.g., Bartel et al., 
(1993) Science 261:1411-1418. 

Alternatively, gene expression can be inhibited by targeting nucleotide sequences 
complementary to the regulatory region (e.g., promoter and/or enhancers) to form triple helical 
structures that prevent transcription of the gene in target cells. See generally, Helene (1991) 
Anticancer Drug Des. 6: 569-84; Helene et al. (1992) Ann. N.Y. Acad. Sci. 660:27-36; and 
Maher (1992) Bioessays 14: 807-15. 

In various embodiments, the nucleic acids of the invention can be modified at the base 
moiety, sugar moiety or phosphate backbone to improve, e.g., the stability, hybridization, or 
solubility of the molecule. For example, the deoxyribose phosphate backbone of the nucleic 
acids can be modified to generate peptide nucleic acids (see Hyrup et al. (1996) Bioorg Med 
Chem 4: 5-23). As used herein, the terms "peptide nucleic acids" or "PNAs" refer to nucleic acid 
mimics, e.g., DNA mimics, in which the deoxyribose phosphate backbone is replaced by a 
pseudopeptide backbone and only the four natural nucleobases are retained. The neutral 
backbone of PNAs has been shown to allow for specific hybridization to DNA and RNA under 
conditions of low ionic strength. The synthesis of PNA oligomers can be performed using 
standard solid phase peptide synthesis protocols as described in Hyrup et al. (1996) above; 
Perry-O'Keefe et al. (1996) PNAS 93: 14670-675. 

PNAs of the invention can be used in therapeutic and diagnostic applications. For 
example, PNAs can be used as antisense or antigene agents for sequence-specific modulation of 
gene expression by, e.g., inducing transcription or translation arrest or inhibiting replication. 
PNAs of the invention can also be used, e.g., in the analysis of single base pair mutations in a 
gene by, e.g., PNA directed PCR clamping; as artificial restriction enzymes when used in 
combination with other enzymes, e.g., S1 nucleases (Hyrup B. (1996) above); or as probes or 
primers for DNA sequence and hybridization (Hyrup et al. (1996), above; Perry-O'Keefe (1996), 
above). 

In another embodiment, PNAs of the invention can be modified, e.g., to enhance their 
stability or cellular uptake, by attaching lipophilic or other helper groups to PNA, by the 
formation of PNA-DNA chimeras, or by the use of liposomes or other techniques of drug 
delivery known in the art. For example, PNA-DNA chimeras can be generated that may 
combine the advantageous properties of PNA and DNA. Such chimeras allow DNA recognition 
enzymes, e.g., RNase H and DNA polymerases, to interact with the DNA portion while the PNA 
portion would provide high binding affinity and specificity. PNA-DNA chimeras can be linked 
using linkers of appropriate lengths selected in terms of base stacking, number of bonds between 
the nucleobases, and orientation (Hyrup (1996) above). The synthesis of PNA-DNA chimeras 
can be performed as described in Hyrup (1996) above and Finn et al. (1996) Nucl Acids Res 24: 
3357-63. For example, a DNA chain can be synthesized on a solid support using standard 
phosphoramidite coupling chemistry, and modified nucleoside analogs, e.g., 
5'-(4-methoxytrityl)amino-5'-deoxy-thymidine phosphoramidite, can be used between the PNA 
and the 5' end of DNA (Mag et al. (1989) Nucl Acid Res 17: 5973-88). PNA monomers are then 
coupled in a stepwise manner to produce a chimeric molecule with a 5' PNA segment and a 3' 
DNA segment (Finn et al. (1996) above). Alternatively, chimeric molecules can be synthesized 
with a 5' DNA segment and a 3' PNA segment. See, Petersen et al. (1975) Bioorg Med Chem 
Lett 5: 1119-11124. 

In other embodiments, the oligonucleotide may include other appended groups such as 
peptides (e.g., for targeting host cell receptors in vivo), or agents facilitating transport across the 
cell membrane (see, e.g., Letsinger et al., 1989, Proc. Natl. Acad. Sci. U.S.A. 86:6553-6556; 
Lemaitre et al., 1987, Proc. Natl. Acad. Sci. 84:648-652; PCT Publication No. WO88/09810) or 
the blood-brain barrier (see, e.g., PCT Publication No. WO89/10134). In addition, 
oligonucleotides can be modified with hybridization triggered cleavage agents (See, e.g., Krol et 
al., 1988, BioTechniques 6:958-976) or intercalating agents. (See, e.g., Zon, 1988, Pharm. Res. 
5: 539-549). To this end, the oligonucleotide may be conjugated to another molecule, e.g., a 
peptide, a hybridization triggered cross-linking agent, a transport agent, a hybridization-triggered 
cleavage agent, etc. 

4.5 HOSTS 

The present invention further provides host cells genetically engineered to contain the 
polynucleotides of the invention. For example, such host cells may contain nucleic acids of the 
invention introduced into the host cell using known transformation, transfection or infection 
methods. The present invention still further provides host cells genetically engineered to express 
the polynucleotides of the invention, wherein such polynucleotides are in operative association 
with a regulatory sequence heterologous to the host cell which drives expression of the 
polynucleotides in the cell. 

Knowledge of nucleic acid sequences allows for modification of cells to permit, or 
increase, expression of endogenous polypeptide. Cells can be modified (e.g., by homologous 
recombination) to provide increased polypeptide expression by replacing, in whole or in part, the 
naturally occurring promoter with all or part of a heterologous promoter so that the cells express 
the polypeptide at higher levels. The heterologous promoter is inserted in such a manner that it 
is operatively linked to the encoding sequences. See, for example, PCT International Publication 
No. WO94/12650, PCT International Publication No. WO92/20808, and PCT International 
Publication No. WO91/09955. It is also contemplated that, in addition to heterologous promoter 
DNA, amplifiable marker DNA (e.g., ada, dhfr, and the multifunctional CAD gene which 
encodes carbamyl phosphate synthase, aspartate transcarbamylase, and dihydroorotase) and/or 
intron DNA may be inserted along with the heterologous promoter DNA. If linked to the coding 
sequence, amplification of the marker DNA by standard selection methods results in 
co-amplification of the desired protein coding sequences in the cells. 

The host cell can be a higher eukaryotic host cell, such as a mammalian cell, a lower 
eukaryotic host cell, such as a yeast cell, or the host cell can be a prokaryotic cell, such as a 
bacterial cell. Introduction of the recombinant construct into the host cell can be effected by 
calcium phosphate transfection, DEAE-dextran mediated transfection, or electroporation (Davis, 
L. et al., Basic Methods in Molecular Biology (1986)). The host cells containing one of the 
polynucleotides of the invention can be used in conventional manners to produce the gene 
product encoded by the isolated fragment (in the case of an ORF) or can be used to produce a 
heterologous protein under the control of the EMF. 

Any host/vector system can be used to express one or more of the ORFs of the present 
invention. These include, but are not limited to, eukaryotic hosts such as HeLa cells, CV-1 cells, 
COS cells, 293 cells, and Sf9 cells, as well as prokaryotic hosts such as E. coli and B. subtilis. 
The most preferred cells are those which do not normally express the particular polypeptide or 
protein or which express the polypeptide or protein at a low natural level. Mature proteins can 
be expressed in mammalian cells, yeast, bacteria, or other cells under the control of appropriate 
promoters. Cell-free translation systems can also be employed to produce such proteins using 
RNAs derived from the DNA constructs of the present invention. Appropriate cloning and 
expression vectors for use with prokaryotic and eukaryotic hosts are described by Sambrook, et 
al., in Molecular Cloning: A Laboratory Manual, Second Edition, Cold Spring Harbor, New 
York (1989), the disclosure of which is hereby incorporated by reference. 

Various mammalian cell culture systems can also be employed to express recombinant 
protein. Examples of mammalian expression systems include the COS-7 lines of monkey kidney 
fibroblasts, described by Gluzman, Cell 23:175 (1981). Other cell lines capable of expressing a 
compatible vector are, for example, C127 cells, monkey COS cells, Chinese Hamster Ovary 
(CHO) cells, human kidney 293 cells, human epidermal A431 cells, human Colo205 cells, 3T3 
cells, CV-1 cells, other transformed primate cell lines, normal diploid cells, cell strains derived 
from in vitro culture of primary tissue, primary explants, HeLa cells, mouse L cells, BHK, 
HL-60, U937, HaK or Jurkat cells. Mammalian expression vectors will comprise an origin of 
replication, a suitable promoter and also any necessary ribosome binding sites, polyadenylation 
site, splice donor and acceptor sites, transcriptional termination sequences, and 5' flanking 
nontranscribed sequences. DNA sequences derived from the SV40 viral genome, for example, 
SV40 origin, early promoter, enhancer, splice, and polyadenylation sites may be used to provide 
the required nontranscribed genetic elements. Recombinant polypeptides and proteins produced 
in bacterial culture are usually isolated by initial extraction from cell pellets, followed by one or 
more salting-out, aqueous ion exchange or size exclusion chromatography steps. Protein 
refolding steps can be used, as necessary, in completing configuration of the mature protein. 
Finally, high performance liquid chromatography (HPLC) can be employed for final purification 
steps. Microbial cells employed in expression of proteins can be disrupted by any convenient 
method, including freeze-thaw cycling, sonication, mechanical disruption, or use of cell lysing 
agents. 

Alternatively, it may be possible to produce the protein in lower eukaryotes such as yeast 
or insects or in prokaryotes such as bacteria. Potentially suitable yeast strains include 
Saccharomyces cerevisiae, Schizosaccharomyces pombe, Kluyveromyces strains, Candida, or 
any yeast strain capable of expressing heterologous proteins. Potentially suitable bacterial 
strains include Escherichia coli, Bacillus subtilis, Salmonella typhimurium, or any bacterial 
strain capable of expressing heterologous proteins. If the protein is made in yeast or bacteria, it 
may be necessary to modify the protein produced therein, for example by phosphorylation or 
glycosylation of the appropriate sites, in order to obtain the functional protein. Such covalent 
attachments may be accomplished using known chemical or enzymatic methods. 

In another embodiment of the present invention, cells and tissues may be engineered to 
express an endogenous gene comprising the polynucleotides of the invention under the control of 
inducible regulatory elements, in which case the regulatory sequences of the endogenous gene 
may be replaced by homologous recombination. As described herein, gene targeting can be used 
to replace a gene's existing regulatory region with a regulatory sequence isolated from a different 
gene or a novel regulatory sequence synthesized by genetic engineering methods. Such 
regulatory sequences may be comprised of promoters, enhancers, scaffold-attachment regions, 
negative regulatory elements, transcriptional initiation sites, regulatory protein binding sites or 
combinations of said sequences. Alternatively, sequences which affect the structure or stability 
of the RNA or protein produced may be replaced, removed, added, or otherwise modified by 
targeting. These sequences include polyadenylation signals, mRNA stability elements, splice 
sites, leader sequences for enhancing or modifying transport or secretion properties of the 
protein, or other sequences which alter or improve the function or stability of protein or RNA 
molecules. 

The targeting event may be a simple insertion of the regulatory sequence, placing the 
gene under the control of the new regulatory sequence, e.g., inserting a new promoter or 
enhancer or both upstream of a gene. Alternatively, the targeting event may be a simple deletion 
of a regulatory element, such as the deletion of a tissue-specific negative regulatory element. 
Alternatively, the targeting event may replace an existing element; for example, a tissue-specific 
enhancer can be replaced by an enhancer that has broader or different cell-type specificity than 
the naturally occurring elements. Here, the naturally occurring sequences are deleted and new 
sequences are added. In all cases, the identification of the targeting event may be facilitated by 
the use of one or more selectable marker genes that are contiguous with the targeting DNA, 
allowing for the selection of cells in which the exogenous DNA has integrated into the host cell 
genome. The identification of the targeting event may also be facilitated by the use of one or 
more marker genes exhibiting the property of negative selection, such that the negatively 
selectable marker is linked to the exogenous DNA, but configured such that the negatively 
selectable marker flanks the targeting sequence, and such that a correct homologous 
recombination event with sequences in the host cell genome does not result in the stable 
integration of the negatively selectable marker. Markers useful for this purpose include the 
Herpes Simplex Virus thymidine kinase (TK) gene or the bacterial xanthine-guanine 
phosphoribosyl-transferase (gpt) gene. 

The gene targeting or gene activation techniques which can be used in accordance with 
this aspect of the invention are more particularly described in U.S. Patent No. 5,272,071 to 
Chappel; U.S. Patent No. 5,578,461 to Sherwin et al.; International Application No. 
PCT/US92/09627 (WO93/09222) by Selden et al.; and International Application No. 
PCT/US90/06436 (WO91/06667) by Skoultchi et al., each of which is incorporated by reference 
herein in its entirety. 

4.6 POLYPEPTIDES OF THE INVENTION 

The isolated polypeptides of the invention include, but are not limited to, a polypeptide 
comprising: the amino acid sequences set forth as any one of SEQ ID NO: 1351-2700 or an 
amino acid sequence encoded by any one of the nucleotide sequences SEQ ID NO: 1-1350 or the 
corresponding full length or mature protein. Polypeptides of the invention also include 
polypeptides preferably with biological or immunological activity that are encoded by: (a) a 
polynucleotide having any one of the nucleotide sequences set forth in SEQ ID NO: 1-1350 or (b) 
polynucleotides encoding any one of the amino acid sequences set forth as SEQ ID NO: 1351-2700 
or (c) polynucleotides that hybridize to the complement of the polynucleotides of either (a) 
or (b) under stringent hybridization conditions. The invention also provides biologically active 
or immunologically active variants of any of the amino acid sequences set forth as SEQ ID 
NO: 1351-2700 or the corresponding full length or mature protein; and "substantial equivalents" 
thereof (e.g., with at least about 65%, at least about 70%, at least about 75%, at least about 80%, 
at least about 85%, 86%, 87%, 88%, 89%, at least about 90%, 91%, 92%, 93%, 94%, typically at 
least about 95%, 96%, 97%, more typically at least about 98%, or most typically at least about 
99% amino acid identity) that retain biological activity. Polypeptides encoded by allelic variants 
may have a similar, increased, or decreased activity compared to polypeptides comprising SEQ 
ID NO: 1351-2700. 
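
The percent amino acid identity thresholds recited above can be illustrated with a minimal Python sketch. It assumes the two sequences have already been aligned by an external program, with gaps written as "-", and it assumes identity is counted over all alignment columns, which is only one of several conventions in use. The function names and example sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns in which both sequences carry the same
    residue. Inputs must be pre-aligned and of equal length; '-' marks a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Ignore columns that are gaps in both sequences.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == "-" and b == "-")]
    if not columns:
        raise ValueError("alignment contains no residues")
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def meets_identity_threshold(aligned_a: str, aligned_b: str,
                             threshold: float = 65.0) -> bool:
    """True if the pair reaches the chosen identity threshold (in percent)."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Hypothetical eight-residue peptides differing at one position.
print(percent_identity("MKTAYIAK", "MKTGYIAK"))
```

Identity alone does not establish a substantial equivalent in the sense of the text, which also requires that biological activity be retained.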

Fragments of the proteins of the present invention which are capable of exhibiting 
biological activity are also encompassed by the present invention. Fragments of the protein may 
be in linear form or they may be cyclized using known methods, for example, as described in H. 
U. Saragovi, et al., Bio/Technology 10, 773-778 (1992) and in R. S. McDowell, et al., J. Amer. 
Chem. Soc. 114, 9245-9253 (1992), both of which are incorporated herein by reference. Such 
fragments may be fused to carrier molecules such as immunoglobulins for many purposes, 
including increasing the valency of protein binding sites. 

The present invention also provides both full-length and mature forms (for example, 
without a signal sequence or precursor sequence) of the disclosed proteins. The protein coding 
sequence is identified in the sequence listing by translation of the disclosed nucleotide 
sequences. The mature form of such protein may be obtained by expression of a full-length 
polynucleotide in a suitable mammalian cell or other host cell. The sequence of the mature form 
of the protein is also determinable from the amino acid sequence of the full-length form. Where 
proteins of the present invention are membrane bound, soluble forms of the proteins are also 
provided. In such forms, part or all of the regions causing the proteins to be membrane bound 
are deleted so that the proteins are fully secreted from the cell in which they are expressed. 

Protein compositions of the present invention may further comprise an acceptable carrier, 
such as a hydrophilic, e.g., pharmaceutically acceptable, carrier. 

The present invention further provides isolated polypeptides encoded by the nucleic acid 
fragments of the present invention or by degenerate variants of the nucleic acid fragments of the 
present invention. By "degenerate variant" is intended nucleotide fragments which differ from a 
nucleic acid fragment of the present invention (e.g., an ORF) by nucleotide sequence but, due to 
the degeneracy of the genetic code, encode an identical polypeptide sequence. Preferred nucleic 
acid fragments of the present invention are the ORFs that encode proteins. 
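
The definition of a degenerate variant given above can be checked mechanically. The following minimal Python sketch builds the standard genetic code, translates two in-frame nucleotide fragments, and reports whether they differ in nucleotide sequence while encoding the same polypeptide. The function names and example fragments are hypothetical, and the sketch assumes DNA input in the forward reading frame under the standard code only.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order; '*' is a stop.
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(orf: str) -> str:
    """Translate an in-frame DNA fragment codon by codon."""
    orf = orf.upper()
    if len(orf) % 3 != 0:
        raise ValueError("fragment length must be a multiple of three")
    return "".join(CODON_TABLE[orf[i:i + 3]] for i in range(0, len(orf), 3))


def is_degenerate_variant(fragment_a: str, fragment_b: str) -> bool:
    """True if the fragments differ in nucleotide sequence but encode an
    identical polypeptide sequence."""
    return (fragment_a.upper() != fragment_b.upper()
            and translate(fragment_a) == translate(fragment_b))


# Both hypothetical fragments encode Met-Ala-Ser using different codons.
print(translate("ATGGCTAGC"), is_degenerate_variant("ATGGCTAGC", "ATGGCCTCA"))
```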
A variety of methodologies known in the art can be utilized to obtain any one of the 
isolated polypeptides or proteins of the present invention. At the simplest level, the amino acid 
sequence can be synthesized using commercially available peptide synthesizers. The 
synthetically-constructed protein sequences, by virtue of sharing primary, secondary or tertiary 
structural and/or conformational characteristics with proteins may possess biological properties 
in common therewith, including protein activity. This technique is particularly useful in 
producing small peptides and fragments of larger polypeptides. Fragments are useful, for 
example, in generating antibodies against the native polypeptide. Thus, they may be employed 
as biologically active or immunological substitutes for natural, purified proteins in screening of 
therapeutic compounds and in immunological processes for the development of antibodies. 

The polypeptides and proteins of the present invention can alternatively be purified from 
cells which have been altered to express the desired polypeptide or protein. As used herein, a 
cell is said to be altered to express a desired polypeptide or protein when the cell, through genetic 
manipulation, is made to produce a polypeptide or protein which it normally does not produce or 
which the cell normally produces at a lower level. One skilled in the art can readily adapt 
procedures for introducing and expressing either recombinant or synthetic sequences into 
eukaryotic or prokaryotic cells in order to generate a cell which produces one of the polypeptides 
or proteins of the present invention. 

The invention also relates to methods for producing a polypeptide comprising growing a 
culture of host cells of the invention in a suitable culture medium, and purifying the protein from 
the cells or the culture in which the cells are grown. For example, the methods of the invention 
include a process for producing a polypeptide in which a host cell containing a suitable 
expression vector that includes a polynucleotide of the invention is cultured under conditions that 
allow expression of the encoded polypeptide. The polypeptide can be recovered from the 
culture, conveniently from the culture medium, or from a lysate prepared from the host cells and 
further purified. Preferred embodiments include those in which the protein produced by such 
process is a full length or mature form of the protein. 

In an alternative method, the polypeptide or protein is purified from bacterial cells which 
naturally produce the polypeptide or protein. One skilled in the art can readily follow known 
methods for isolating polypeptides and proteins in order to obtain one of the isolated 
polypeptides or proteins of the present invention. These include, but are not limited to, 
immunochromatography, HPLC, size-exclusion chromatography, ion-exchange chromatography, 
and immuno-affinity chromatography. See, e.g., Scopes, Protein Purification: Principles and 
Practice, Springer-Verlag (1994); Sambrook, et al., in Molecular Cloning: A Laboratory 
Manual; Ausubel et al., Current Protocols in Molecular Biology. Polypeptide fragments that 
retain biological/immunological activity include fragments comprising greater than about 100 
amino acids, or greater than about 200 amino acids, and fragments that encode specific protein 
domains. 

The purified polypeptides can be used in in vitro binding assays which are well known in 
the art to identify molecules which bind to the polypeptides. These molecules include, but are not 
limited to, e.g., small molecules, molecules from combinatorial libraries, antibodies or other 
proteins. The molecules identified in the binding assay are then tested for antagonist or agonist 
activity in in vivo tissue culture or animal models that are well known in the art. In brief, the 
molecules are titrated into a plurality of cell cultures or animals and then tested for either 
cell/animal death or prolonged survival of the animal/cells. 

In addition, the peptides of the invention or molecules capable of binding to the peptides 
may be complexed with toxins, e.g., ricin or cholera, or with other compounds that are toxic to 
cells. The toxin-binding molecule complex is then targeted to a tumor or other cell by the 
specificity of the binding molecule for SEQ ID NO: 1351-2700. 

The protein of the invention may also be expressed as a product of transgenic animals, 
e.g., as a component of the milk of transgenic cows, goats, pigs, or sheep which are characterized 
by somatic or germ cells containing a nucleotide sequence encoding the protein. 

The proteins provided herein also include proteins characterized by amino acid sequences 
similar to those of purified proteins but into which modifications are naturally provided or 
deliberately engineered. For example, modifications, in the peptide or DNA sequence, can be 
made by those skilled in the art using known techniques. Modifications of interest in the protein 
sequences may include the alteration, substitution, replacement, insertion or deletion of a 
selected amino acid residue in the coding sequence. For example, one or more of the cysteine 
residues may be deleted or replaced with another amino acid to alter the conformation of the 
molecule. Techniques for such alteration, substitution, replacement, insertion or deletion are 
well known to those skilled in the art (see, e.g., U.S. Pat. No. 4,518,584). Preferably, such 
alteration, substitution, replacement, insertion or deletion retains the desired activity of the 
protein. Regions of the protein that are important for the protein function can be determined by 
various methods known in the art including the alanine-scanning method which involves 
systematic substitution of single or strings of amino acids with alanine, followed by testing the 
resulting alanine-containing variant for biological activity. This type of analysis determines the 
importance of the substituted amino acid(s) in biological activity. Regions of the protein that are 
important for protein function may be determined by the eMATRIX program. 
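
The substitution step of the alanine-scanning method described above can be sketched in a few lines of Python. The sketch only enumerates the variant sequences; testing each variant for biological activity is an experimental step that is not modeled. The function name `alanine_scan`, the `window` parameter (1 for single residues, larger values for strings of residues), and the example peptide are hypothetical.

```python
def alanine_scan(protein: str, window: int = 1) -> list[tuple[int, str]]:
    """Enumerate alanine-scanning variants of a protein sequence.

    Each variant replaces `window` consecutive residues with alanine.
    Returns (1-based start position, variant sequence) pairs; windows that
    are already all alanine are skipped because nothing would change."""
    if window < 1 or window > len(protein):
        raise ValueError("window must be between 1 and the protein length")
    variants = []
    for i in range(len(protein) - window + 1):
        segment = protein[i:i + window]
        if set(segment) == {"A"}:
            continue
        variant = protein[:i] + "A" * window + protein[i + window:]
        variants.append((i + 1, variant))
    return variants


# Hypothetical four-residue peptide; position 3 is already alanine.
for position, variant in alanine_scan("MKAC"):
    print(position, variant)
```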

Other fragments and derivatives of the sequences of proteins which would be expected to 
retain protein activity in whole or in part and are useful for screening or other immunological 
methodologies may also be easily made by those skilled in the art given the disclosures herein. 
Such modifications are encompassed by the present invention. 

The protein may also be produced by operably linking the isolated polynucleotide of the 
invention to suitable control sequences in one or more insect expression vectors, and employing 
an insect expression system. Materials and methods for baculovirus/insect cell expression 
systems are commercially available in kit form from, e.g., Invitrogen, San Diego, Calif., U.S.A. 
(the MaxBac™ kit), and such methods are well known in the art, as described in Summers and 
Smith, Texas Agricultural Experiment Station Bulletin No. 1555 (1987), incorporated herein by 
reference. As used herein, an insect cell capable of expressing a polynucleotide of the present 
invention is "transformed." 

The protein of the invention may be prepared by culturing transformed host cells under 
culture conditions suitable to express the recombinant protein. The resulting expressed protein 
may then be purified from such culture (i.e., from culture medium or cell extracts) using known 
purification processes, such as gel filtration and ion exchange chromatography. The purification 
of the protein may also include an affinity column containing agents which will bind to the 
protein; one or more column steps over such affinity resins as concanavalin A-agarose, 
heparin-toyopearl™ or Cibacron blue 3GA Sepharose™; one or more steps involving 
hydrophobic interaction chromatography using such resins as phenyl ether, butyl ether, or propyl 
ether; or immunoaffinity chromatography. 

Alternatively, the protein of the invention may also be expressed in a form which will 
facilitate purification. For example, it may be expressed as a fusion protein, such as those of 
maltose binding protein (MBP), glutathione-S-transferase (GST) or thioredoxin (TRX), or as a 
His tag. Kits for expression and purification of such fusion proteins are commercially available 
from New England BioLabs (Beverly, Mass.), Pharmacia (Piscataway, N.J.) and Invitrogen, 
respectively. The protein can also be tagged with an epitope and subsequently purified by using 
a specific antibody directed to such epitope. One such epitope ("FLAG®") is commercially 
available from Kodak (New Haven, Conn.). 

Finally, one or more reverse-phase high performance liquid chromatography (RP- HPLC) 
steps employing hydrophobic RP-HPLC media, e.g., silica gel having pendant methyl or other 

30 aliphatic groups, can be employed to further purify the protein. Some or all of the foregoing 
purification steps, in various combinations, can also be employed to provide a substantially 
homogeneous isolated recombinant protein. The protein thus purified is substantially free of 
other mammalian proteins and is defined in accordance with the present invention as an "isolated 
protein." 

The polypeptides of the invention include analogs (variants). This embraces fragments, 
as well as peptides in which one or more amino acids have been deleted, inserted, or substituted. 
Also, analogs of the polypeptides of the invention embrace fusions of the polypeptides or 
modifications of the polypeptides of the invention, wherein the polypeptide or analog is fused to 
another moiety or moieties, e.g., a targeting moiety or another therapeutic agent. Such analogs 
may exhibit improved properties such as activity and/or stability. Examples of moieties which 
may be fused to the polypeptide or an analog include, for example, targeting moieties which 
provide for the delivery of polypeptide to pancreatic cells, e.g., antibodies to pancreatic cells, 
antibodies to immune cells such as T-cells, monocytes, dendritic cells, granulocytes, etc., as well 
as receptors and ligands expressed on pancreatic or immune cells. Other moieties which may be 
fused to the polypeptide include therapeutic agents which are used for treatment, for example, 
immunosuppressive drugs such as cyclosporin, FK506, azathioprine, CD3 antibodies and 
steroids. Also, polypeptides may be fused to immune modulators, and other cytokines such as 
alpha or beta interferon. 

4.6.1 DETERMINING POLYPEPTIDE AND POLYNUCLEOTIDE IDENTITY 
AND SIMILARITY 

Preferred methods to determine identity and/or similarity are designed to give the largest match 
between the sequences tested. Methods to determine identity and similarity are codified in computer 
programs including, but not limited to, the GCG program package, including GAP 
(Devereux, J., et al., Nucleic Acids Research 12(1):387 (1984); Genetics Computer Group, 
University of Wisconsin, Madison, WI), BLASTP, BLASTN, BLASTX, FASTA (Altschul, S.F. 
et al., J. Molec. Biol. 215:403-410 (1990)), PSI-BLAST (Altschul S.F. et al., Nucleic Acids Res. 
vol. 25, pp. 3389-3402, herein incorporated by reference), eMatrix software (Wu et al., J. Comp. 
Biol., Vol. 6, pp. 219-235 (1999), herein incorporated by reference), eMotif software 
(Nevill-Manning et al., ISMB-97, Vol. 4, pp. 202-209, herein incorporated by reference), pFam software 
(Sonnhammer et al., Nucleic Acids Res., Vol. 26(1), pp. 320-322 (1998), herein incorporated by 
reference) and the Kyte-Doolittle hydrophobicity prediction algorithm (J. Mol. Biol., 157, pp. 
105-31 (1982), incorporated herein by reference). The BLAST programs are publicly available 
from the National Center for Biotechnology Information (NCBI) and other sources (BLAST 
Manual, Altschul, S., et al. NCBI NLM NIH Bethesda, MD 20894; Altschul, S., et al., J. Mol. 
Biol. 215:403-410 (1990)). 
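
By way of illustration only, the notion of aligning two sequences to give the largest match and 
then reporting percent identity can be sketched as follows. This is a minimal Needleman-Wunsch 
global alignment with a linear gap penalty; it is not the GAP or BLAST implementation, and the 
function names and the scoring values (match, mismatch, gap) are assumptions chosen for the 
example rather than parameters taken from this specification. 

```python
def global_align(a, b, match=1, mismatch=-1, gap=-2):
    """Needleman-Wunsch global alignment with a linear gap penalty.

    The scoring values are illustrative assumptions, not GAP/BLAST defaults.
    Returns the two aligned strings, using '-' for gap positions.
    """
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            sub = match if a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i -= 1
            j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(a, b):
    """Identical aligned positions as a percentage of the alignment length."""
    aln_a, aln_b = global_align(a, b)
    if not aln_a:
        return 0.0
    matches = sum(1 for x, y in zip(aln_a, aln_b) if x == y and x != "-")
    return 100.0 * matches / len(aln_a)
```

Note that percent identity depends on the denominator chosen (here, the full alignment length 
including gap columns); other conventions, such as dividing by the length of the shorter 
sequence, give different values for the same alignment. 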

4.7 CHIMERIC AND FUSION PROTEINS 

The invention also provides chimeric or fusion proteins. As used herein, a "chimeric 
protein" or "fusion protein" comprises a polypeptide of the invention operatively linked to 
another polypeptide. Within a fusion protein the polypeptide according to the invention can 
correspond to all or a portion of a protein according to the invention. In one embodiment, a 
fusion protein comprises at least one biologically active portion of a protein according to the 
invention. In another embodiment, a fusion protein comprises at least two biologically active 
portions of a protein according to the invention. Within the fusion protein, the term "operatively 
linked" is intended to indicate that the polypeptide according to the invention and the other 
polypeptide are fused in-frame to each other. The polypeptide can be fused to the N-terminus or 
C-terminus. 
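
As an illustrative sketch only, the requirement that two coding sequences be fused in-frame can 
be checked computationally. The function name `fuse_in_frame` and the example sequences are 
hypothetical and are not sequences of the invention; the check assumes both inputs are DNA 
coding sequences that begin at the first base of a codon. 

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def fuse_in_frame(n_terminal_cds, c_terminal_cds):
    """Join two coding sequences so the second is read in the frame of the first.

    Hypothetical helper for illustration. A stop codon at the end of the
    N-terminal partner is removed so that translation continues into the
    C-terminal partner.
    """
    first = n_terminal_cds.upper()
    second = c_terminal_cds.upper()
    if len(first) % 3 or len(second) % 3:
        raise ValueError("each coding sequence must be a whole number of codons")
    if first[-3:] in STOP_CODONS:
        first = first[:-3]
    fused = first + second
    codons = [fused[i:i + 3] for i in range(0, len(fused), 3)]
    # Only the final codon may be a stop; an earlier one would truncate the fusion.
    if any(codon in STOP_CODONS for codon in codons[:-1]):
        raise ValueError("internal stop codon: the fusion would be truncated")
    return fused
```

For example, `fuse_in_frame("ATGGCCTAA", "ATGAAATGA")` drops the first partner's stop codon and 
returns a single open reading frame, whereas an input whose length is not a multiple of three is 
rejected because it would shift the frame of the downstream partner. 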

For example, in one embodiment a fusion protein comprises a polypeptide according to 
the invention operably linked to the extracellular domain of a second protein. 

In another embodiment, the fusion protein is a GST-fusion protein in which the 
polypeptide sequences of the invention are fused to the C-terminus of the GST (i.e., glutathione 
S-transferase) sequences. 

In another embodiment, the fusion protein is an immunoglobulin fusion protein in which 
the polypeptide sequences according to the invention, comprising one or more domains, are fused 
to sequences derived from a member of the immunoglobulin protein family. The 
immunoglobulin fusion proteins of the invention can be incorporated into pharmaceutical 
compositions and administered to a subject to inhibit an interaction between a ligand and a 
protein of the invention on the surface of a cell, to thereby suppress signal transduction in vivo. 
The immunoglobulin fusion proteins can be used to affect the bioavailability of a cognate ligand. 
Inhibition of the ligand/protein interaction may be useful therapeutically both for the treatment of 
proliferative and differentiative disorders, e.g., cancer, and for modulating (e.g., promoting or 
inhibiting) cell survival. Moreover, the immunoglobulin fusion proteins of the invention can be 
used as immunogens to produce antibodies in a subject, to purify ligands, and in screening assays 
to identify molecules that inhibit the interaction of a polypeptide of the invention with a ligand. 

A chimeric or fusion protein of the invention can be produced by standard recombinant 
DNA techniques. For example, DNA fragments coding for the different polypeptide sequences 
are ligated together in-frame in accordance with conventional techniques, e.g., by employing 
blunt-ended or stagger-ended termini for ligation, restriction enzyme digestion to provide for 
appropriate termini, filling-in of cohesive ends as appropriate, alkaline phosphatase treatment to 
avoid undesirable joining, and enzymatic ligation. In another embodiment, the fusion gene can 
be synthesized by conventional techniques including automated DNA synthesizers. 
Alternatively, PCR amplification of gene fragments can be carried out using anchor primers that 
give rise to complementary overhangs between two consecutive gene fragments that can 
subsequently be annealed and reamplified to generate a chimeric gene sequence (see, for 
example, Ausubel et al. (eds.) Current Protocols in Molecular Biology, John Wiley & 
Sons, 1992). Moreover, many expression vectors are commercially available that already encode 
a fusion moiety (e.g., a GST polypeptide). A nucleic acid encoding a polypeptide of the 
invention can be cloned into such an expression vector such that the fusion moiety is linked 
in-frame to the protein of the invention. 
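
The annealing of two amplified fragments through a shared overhang can be pictured with the 
following simplified sketch. It models only the sense strand: the 3' end of the upstream 
fragment and the 5' end of the downstream fragment are assumed to carry the same overlap 
sequence introduced by the anchor primers. The function name `join_by_overlap`, the minimum 
overlap length, and the example sequences are hypothetical and chosen for illustration. 

```python
def join_by_overlap(upstream, downstream, min_overlap=6):
    """Merge two sense-strand fragments that share an end-to-start overlap.

    Simplified, hypothetical model of overlap-extension joining: the longest
    suffix of `upstream` that equals a prefix of `downstream` is treated as
    the annealed region and is kept only once in the chimeric product.
    """
    if min_overlap < 1:
        raise ValueError("min_overlap must be at least 1")
    up, down = upstream.upper(), downstream.upper()
    longest = min(len(up), len(down))
    for k in range(longest, min_overlap - 1, -1):
        if up[-k:] == down[:k]:
            return up + down[k:]
    raise ValueError("no overlap of the required length was found")
```

For instance, fragments ending and beginning with the same six-base anchor sequence are merged 
into one product in which that anchor appears once; fragments lacking a sufficiently long shared 
end are rejected rather than joined arbitrarily. 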

4.8 GENE THERAPY 

Mutations in the polynucleotides of the invention may result in loss of normal 
function of the encoded protein. The invention thus provides gene therapy to restore normal 
activity of the polypeptides of the invention, or to treat disease states involving polypeptides of 
the invention. Delivery of a functional gene encoding polypeptides of the invention to 
appropriate cells is effected ex vivo, in situ, or in vivo by use of vectors, and more particularly 
viral vectors (e.g., adenovirus, adeno-associated virus, or a retrovirus), or ex vivo by use of 
physical DNA transfer methods (e.g., liposomes or chemical treatments). See, for example, 
Anderson, Nature, supplement to vol. 392, no. 6679, pp. 25-30 (1998). For additional reviews of 
gene therapy technology see Friedmann, Science, 244: 1275-1281 (1989); Verma, Scientific 
American: 68-84 (1990); and Miller, Nature, 357: 455-460 (1992). Introduction of any one of 
the nucleotides of the present invention or a gene encoding the polypeptides of the present 
invention can also be accomplished with extrachromosomal substrates (transient expression) or 
artificial chromosomes (stable expression). Cells may also be cultured ex vivo in the presence of 
proteins of the present invention in order to proliferate or to produce a desired effect on or 
activity in such cells. Treated cells can then be introduced in vivo for therapeutic purposes. 
Alternatively, it is contemplated that in other human disease states, preventing the expression of 
or inhibiting the activity of polypeptides of the invention will be useful in treating the disease 
states. It is contemplated that antisense therapy or gene therapy could be applied to negatively 
regulate the expression of polypeptides of the invention. 

Other methods of inhibiting expression of a protein include the introduction of antisense 
molecules to the nucleic acids of the present invention, their complements, or their translated RNA 
sequences, by methods known in the art. Further, the polypeptides of the present invention can be 
inhibited by using targeted deletion methods, or the insertion of a negative regulatory element such 
as a silencer, which is tissue specific. 
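
As a minimal sketch of the antisense concept, an antisense DNA oligonucleotide is the reverse 
complement of the targeted stretch of the sense (mRNA) sequence. The function name 
`antisense_dna` and the example target are hypothetical; choosing an effective target region in 
practice involves considerations (accessibility, specificity, chemistry) that this sketch does 
not address. 

```python
# Map each sense-strand base (DNA or RNA alphabet) to its DNA complement.
_COMPLEMENT = str.maketrans("ACGTU", "TGCAA")


def antisense_dna(target):
    """Return the DNA antisense sequence (5'->3') for a sense-strand target.

    Hypothetical helper for illustration; `target` may be written in the DNA
    (T) or RNA (U) alphabet and is read 5'->3'.
    """
    sense = target.upper()
    unexpected = set(sense) - set("ACGTU")
    if unexpected:
        raise ValueError(f"unexpected characters: {sorted(unexpected)}")
    # Complement every base, then reverse so the result also reads 5'->3'.
    return sense.translate(_COMPLEMENT)[::-1]
```

Under these assumptions, `antisense_dna("AUGGCC")` yields `GGCCAT`, which base-pairs with the 
target when the two strands are aligned antiparallel. 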

The present invention still further provides cells genetically engineered in vivo to express the 
polynucleotides of the invention, wherein such polynucleotides are in operative association with a 
regulatory sequence heterologous to the host cell which drives expression of the polynucleotides in 
the cell. These methods can be used to increase or decrease the expression of the polynucleotides of 
the present invention. 

Knowledge of DNA sequences provided by the invention allows for modification of cells to 
permit, increase, or decrease expression of endogenous polypeptide. Cells can be modified (e.g., by 
homologous recombination) to provide increased polypeptide expression by replacing, in whole or 
in part, the naturally occurring promoter with all or part of a heterologous promoter so that the cells 
express the protein at higher levels. The heterologous promoter is inserted in such a manner that it is 
operatively linked to the desired protein encoding sequences. See, for example, PCT International 
Publication No. WO 94/12650, PCT International Publication No. WO 92/20808, and PCT 
International Publication No. WO 91/09955. It is also contemplated that, in addition to heterologous 
promoter DNA, amplifiable marker DNA (e.g., ada, dhfr, and the multifunctional CAD gene which 
encodes carbamyl phosphate synthase, aspartate transcarbamylase, and dihydroorotase) and/or 
intron DNA may be inserted along with the heterologous promoter DNA. If linked to the desired 
protein coding sequence, amplification of the marker DNA by standard selection methods results in 
co-amplification of the desired protein coding sequences in the cells. 

In another embodiment of the present invention, cells and tissues may be engineered to 
express an endogenous gene comprising the polynucleotides of the invention under the control of 
inducible regulatory elements, in which case the regulatory sequences of the endogenous gene may 
be replaced by homologous recombination. As described herein, gene targeting can be used to 
replace a gene's existing regulatory region with a regulatory sequence isolated from a different gene 
or a novel regulatory sequence synthesized by genetic engineering methods. Such regulatory 
sequences may be comprised of promoters, enhancers, scaffold-attachment regions, negative 
regulatory elements, transcriptional initiation sites, regulatory protein binding sites or combinations 
of said sequences. Alternatively, sequences which affect the structure or stability of the RNA or 
protein produced may be replaced, removed, added, or otherwise modified by targeting. These 
sequences include polyadenylation signals, mRNA stability elements, splice sites, leader sequences 
for enhancing or modifying transport or secretion properties of the protein, or other sequences 
which alter or improve the function or stability of protein or RNA molecules. 

The targeting event may be a simple insertion of the regulatory sequence, placing the gene 
under the control of the new regulatory sequence, e.g., inserting a new promoter or enhancer or both 
upstream of a gene. Alternatively, the targeting event may be a simple deletion of a regulatory 
element, such as the deletion of a tissue-specific negative regulatory element. Alternatively, the 
targeting event may replace an existing element; for example, a tissue-specific enhancer can be 
replaced by an enhancer that has broader or different cell-type specificity than the naturally 
occurring elements. Here, the naturally occurring sequences are deleted and new sequences are 
added. In all cases, the identification of the targeting event may be facilitated by the use of one or 
more selectable marker genes that are contiguous with the targeting DNA, allowing for the selection 
of cells in which the exogenous DNA has integrated into the cell genome. The identification of the 
targeting event may also be facilitated by the use of one or more marker genes exhibiting the 
property of negative selection, such that the negatively selectable marker is linked to the exogenous 
DNA, but configured such that the negatively selectable marker flanks the targeting sequence, and 
such that a correct homologous recombination event with sequences in the host cell genome does 
not result in the stable integration of the negatively selectable marker. Markers useful for this 
purpose include the Herpes Simplex Virus thymidine kinase (TK) gene or the bacterial 
xanthine-guanine phosphoribosyl-transferase (gpt) gene. 

The gene targeting or gene activation techniques which can be used in accordance with this 
aspect of the invention are more particularly described in U.S. Patent No. 5,272,071 to Chappel; 
U.S. Patent No. 5,578,461 to Sherwin et al.; International Application No. PCT/US92/09627 
(WO93/09222) by Selden et al.; and International Application No. PCT/US90/06436 
(WO91/06667) by Skoultchi et al., each of which is incorporated by reference herein in its entirety. 

4.9 TRANSGENIC ANIMALS 

In preferred methods to determine biological functions of the polypeptides of the 
invention in vivo, one or more genes provided by the invention are either overexpressed or 
inactivated in the germ line of animals using homologous recombination [Capecchi, Science 
244:1288-1292 (1989)]. Animals in which the gene is overexpressed, under the regulatory 
control of exogenous or endogenous promoter elements, are known as transgenic animals. 
Animals in which an endogenous gene has been inactivated by homologous recombination are 
referred to as "knockout" animals. Knockout animals, preferably non-human mammals, can be 
prepared as described in U.S. Patent No. 5,557,032, incorporated herein by reference. Transgenic 
animals are useful to determine the roles polypeptides of the invention play in biological 
processes, and preferably in disease states. Transgenic animals are useful as model systems to 
identify compounds that modulate lipid metabolism. Transgenic animals, preferably non-human 
mammals, are produced using methods as described in U.S. Patent No. 5,489,743 and PCT 
Publication No. WO 94/28122, incorporated herein by reference. 

Transgenic animals can be prepared wherein all or part of a promoter of the 
polynucleotides of the invention is either activated or inactivated to alter the level of expression 
of the polypeptides of the invention. Inactivation can be carried out using homologous 
recombination methods described above. Activation can be achieved by supplementing or even 
replacing the homologous promoter to provide for increased protein expression. The homologous 
promoter can be supplemented by insertion of one or more heterologous enhancer elements 
known to confer promoter activation in a particular tissue. 

The polynucleotides of the present invention also make possible the development, 
through, e.g., homologous recombination or knockout strategies, of animals that fail to express 
polypeptides of the invention or that express a variant polypeptide. Such animals are useful as 
models for studying the in vivo activities of polypeptide as well as for studying modulators of the 
polypeptides of the invention. 

4.10 USES AND BIOLOGICAL ACTIVITY 

The polynucleotides and proteins of the present invention are expected to exhibit one or 
more of the uses or biological activities (including those associated with assays cited herein) 
identified herein. Uses or activities described for proteins of the present invention may be 
provided by administration or use of such proteins or of polynucleotides encoding such proteins 
(such as, for example, in gene therapies or vectors suitable for introduction of DNA). The 
mechanism underlying the particular condition or pathology will dictate whether the 
polypeptides of the invention, the polynucleotides of the invention or modulators (activators or 
inhibitors) thereof would be beneficial to the subject in need of treatment. Thus, "therapeutic 
compositions of the invention" include compositions comprising isolated polynucleotides 
(including recombinant DNA molecules, cloned genes and degenerate variants thereof) or 
polypeptides of the invention (including full length protein, mature protein and truncations or 
domains thereof), or compounds and other substances that modulate the overall activity of the 
target gene products, either at the level of target gene/protein expression or target protein 
activity. Such modulators include polypeptides, analogs (variants), including fragments and 
fusion proteins, antibodies and other binding proteins; chemical compounds that directly or 
indirectly activate or inhibit the polypeptides of the invention (identified, e.g., via drug screening 
assays as described herein); antisense polynucleotides and polynucleotides suitable for triple 
helix formation; and in particular antibodies or other binding partners that specifically recognize 
one or more epitopes of the polypeptides of the invention. 

The polypeptides of the present invention may likewise be involved in cellular activation 
or in one of the other physiological pathways described herein. 

4.10.1 RESEARCH USES AND UTILITIES 

The polynucleotides provided by the present invention can be used by the research 
community for various purposes. The polynucleotides can be used to express recombinant 
protein for analysis, characterization or therapeutic use; as markers for tissues in which the 
corresponding protein is preferentially expressed (either constitutively or at a particular stage of 
tissue differentiation or development or in disease states); as molecular weight markers on gels; 
as chromosome markers or tags (when labeled) to identify chromosomes or to map related gene 
positions; to compare with endogenous DNA sequences in patients to identify potential genetic 
disorders; as probes to hybridize and thus discover novel, related DNA sequences; as a source of 
information to derive PCR primers for genetic fingerprinting; as a probe to "subtract-out" known 
sequences in the process of discovering other novel polynucleotides; for selecting and making 
oligomers for attachment to a "gene chip" or other support, including for examination of 
expression patterns; to raise anti-protein antibodies using DNA immunization techniques; and as 
an antigen to raise anti-DNA antibodies or elicit another immune response. Where the 
polynucleotide encodes a protein which binds or potentially binds to another protein (such as, for 
example, in a receptor-ligand interaction), the polynucleotide can also be used in interaction trap 
assays (such as, for example, that described in Gyuris et al., Cell 75:791-803 (1993)) to identify 
polynucleotides encoding the other protein with which binding occurs or to identify inhibitors of 
the binding interaction. 

The polypeptides provided by the present invention can similarly be used in assays to 
determine biological activity, including in a panel of multiple proteins for high-throughput 
screening; to raise antibodies or to elicit another immune response; as a reagent (including the 
labeled reagent) in assays designed to quantitatively determine levels of the protein (or its 
receptor) in biological fluids; as markers for tissues in which the corresponding polypeptide is 
preferentially expressed (either constitutively or at a particular stage of tissue differentiation or 
development or in a disease state); and, of course, to isolate correlative receptors or ligands. 
Proteins involved in these binding interactions can also be used to screen for peptide or small 
molecule inhibitors or agonists of the binding interaction. 

Any or all of these research utilities are capable of being developed into reagent grade or 
kit format for commercialization as research products. 

Methods for performing the uses listed above are well known to those skilled in the art. 
References disclosing such methods include without limitation "Molecular Cloning: A 
Laboratory Manual", 2d ed., Cold Spring Harbor Laboratory Press, Sambrook, J., E. F. Fritsch 
and T. Maniatis eds., 1989, and "Methods in Enzymology: Guide to Molecular Cloning 
Techniques", Academic Press, Berger, S. L. and A. R. Kimmel eds., 1987. 

4.10.2 NUTRITIONAL USES 

Polynucleotides and polypeptides of the present invention can also be used as nutritional 
sources or supplements. Such uses include without limitation use as a protein or amino acid 
supplement, use as a carbon source, use as a nitrogen source and use as a source of carbohydrate. In 
such cases the polypeptide or polynucleotide of the invention can be added to the feed of a 
particular organism or can be administered as a separate solid or liquid preparation, such as in the 
form of powder, pills, solutions, suspensions or capsules. In the case of microorganisms, the 
polypeptide or polynucleotide of the invention can be added to the medium in or on which the 
microorganism is cultured. 

4.10.3 CYTOKINE AND CELL PROLIFERATION/DIFFERENTIATION 
ACTIVITY 

A polypeptide of the present invention may exhibit activity relating to cytokine, cell 
proliferation (either inducing or inhibiting) or cell differentiation (either inducing or inhibiting) 
activity or may induce production of other cytokines in certain cell populations. A 
polynucleotide of the invention can encode a polypeptide exhibiting such attributes. Many 
protein factors discovered to date, including all known cytokines, have exhibited activity in one 
or more factor-dependent cell proliferation assays, and hence the assays serve as a convenient 
confirmation of cytokine activity. The activity of therapeutic compositions of the present 
invention is evidenced by any one of a number of routine factor-dependent cell proliferation 
assays for cell lines including, without limitation, 32D, DA2, DA1G, T10, B9, B9/11, BaF3, 
MC9/G, M+(preB M+), 2E8, RB5, DA1, 123, T1165, HT2, CTLL2, TF-1, Mo7e, CMK, 
HUVEC, and Caco. Therapeutic compositions of the invention can be used in the following: 

Assays for T-cell or thymocyte proliferation include without limitation those described 
in: Current Protocols in Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, D. H. Margulies, E. 
M. Shevach, W. Strober, Pub. Greene Publishing Associates and Wiley-Interscience (Chapter 3, 
In Vitro assays for Mouse Lymphocyte Function 3.1-3.19; Chapter 7, Immunologic studies in 
Humans); Takai et al., J. Immunol. 137:3494-3500, 1986; Bertagnolli et al., J. Immunol. 
145:1706-1712, 1990; Bertagnolli et al., Cellular Immunology 133:327-341, 1991; Bertagnolli 
et al., J. Immunol. 149:3778-3783, 1992; Bowman et al., J. Immunol. 152:1756-1761, 1994. 

Assays for cytokine production and/or proliferation of spleen cells, lymph node cells or 
thymocytes include, without limitation, those described in: Polyclonal T cell stimulation, 
Kruisbeek, A. M. and Shevach, E. M. In Current Protocols in Immunology. J. E. e.a. Coligan 
eds. Vol 1 pp. 3.12.1-3.12.14, John Wiley and Sons, Toronto. 1994; and Measurement of mouse 
and human interleukin-γ, Schreiber, R. D. In Current Protocols in Immunology. J. E. e.a. Coligan 
eds. Vol 1 pp. 6.8.1-6.8.8, John Wiley and Sons, Toronto. 1994. 

Assays for proliferation and differentiation of hematopoietic and lymphopoietic cells 
include, without limitation, those described in: Measurement of Human and Murine Interleukin 2 
and Interleukin 4, Bottomly, K., Davis, L. S. and Lipsky, P. E. In Current Protocols in 
Immunology. J. E. e.a. Coligan eds. Vol 1 pp. 6.3.1-6.3.12, John Wiley and Sons, Toronto. 1991; 
deVries et al., J. Exp. Med. 173:1205-1213, 1991; Moreau et al., Nature 336:690-692, 1988; 
Greenberger et al., Proc. Natl. Acad. Sci. U.S.A. 80:2931-2938, 1983; Measurement of mouse 
and human interleukin 6-Nordan, R. In Current Protocols in Immunology. J. E. Coligan eds. Vol 
1 pp. 6.6.1-6.6.5, John Wiley and Sons, Toronto. 1991; Smith et al., Proc. Natl. Acad. Sci. 
U.S.A. 83:1857-1861, 1986; Measurement of human Interleukin 11-Bennett, F., Giannotti, J., 
Clark, S. C. and Turner, K. J. In Current Protocols in Immunology. J. E. Coligan eds. Vol 1 pp. 
6.15.1 John Wiley and Sons, Toronto. 1991; Measurement of mouse and human Interleukin 
9-Ciarletta, A., Giannotti, J., Clark, S. C. and Turner, K. J. In Current Protocols in Immunology. 
J. E. Coligan eds. Vol 1 pp. 6.13.1, John Wiley and Sons, Toronto. 1991. 

Assays for T-cell clone responses to antigens (which will identify, among others, proteins 
that affect APC-T cell interactions as well as direct T-cell effects by measuring proliferation and 
cytokine production) include, without limitation, those described in: Current Protocols in 
Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, D. H. Margulies, E. M. Shevach, W. Strober, 
Pub. Greene Publishing Associates and Wiley-Interscience (Chapter 3, In Vitro assays for Mouse 
Lymphocyte Function; Chapter 6, Cytokines and their cellular receptors; Chapter 7, 
Immunologic studies in Humans); Weinberger et al., Proc. Natl. Acad. Sci. USA 77:6091-6095, 
1980; Weinberger et al., Eur. J. Immun. 11:405-411, 1981; Takai et al., J. Immunol. 
137:3494-3500, 1986; Takai et al., J. Immunol. 140:508-512, 1988. 

4.10.4 STEM CELL GROWTH FACTOR ACTIVITY 

A polypeptide of the present invention may exhibit stem cell growth factor activity and 
be involved in the proliferation, differentiation and survival of pluripotent and totipotent stem 
cells including primordial germ cells, embryonic stem cells, hematopoietic stem cells and/or 
germ line stem cells. Administration of the polypeptide of the invention to stem cells in vivo or 
ex vivo is expected to maintain and expand cell populations in a totipotential or pluripotential 
state which would be useful for re-engineering damaged or diseased tissues, transplantation, 
manufacture of bio-pharmaceuticals and the development of bio-sensors. The ability to produce 
large quantities of human cells has important working applications for the production of human 
proteins which currently must be obtained from non-human sources or donors; implantation of 
cells to treat diseases such as Parkinson's, Alzheimer's and other neurodegenerative diseases; 
tissues for grafting such as bone marrow, skin, cartilage, tendons, bone, muscle (including 
cardiac muscle), blood vessels, cornea, neural cells, gastrointestinal cells and others; and organs 
for transplantation such as kidney, liver, pancreas (including islet cells), heart and lung. 

It is contemplated that multiple different exogenous growth factors and/or cytokines may 
be administered in combination with the polypeptide of the invention to achieve the desired 
effect, including any of the growth factors listed herein, other stem cell maintenance factors, and 
specifically including stem cell factor (SCF), leukemia inhibitory factor (LIF), Flt-3 ligand 
(Flt-3L), any of the interleukins, recombinant soluble IL-6 receptor fused to IL-6, macrophage 
inflammatory protein 1-alpha (MIP-1-alpha), G-CSF, GM-CSF, thrombopoietin (TPO), platelet 
factor 4 (PF-4), platelet-derived growth factor (PDGF), neural growth factors and basic 
fibroblast growth factor (bFGF). 

Since totipotent stem cells can give rise to virtually any mature cell type, expansion of 
these cells in culture will facilitate the production of large quantities of mature cells. Techniques 
for culturing stem cells are known in the art and administration of polypeptides of the invention, 
optionally with other growth factors and/or cytokines, is expected to enhance the survival and 
proliferation of the stem cell populations. This can be accomplished by direct administration of 
the polypeptide of the invention to the culture medium. Alternatively, stromal cells transfected 
with a polynucleotide that encodes the polypeptide of the invention can be used as a feeder 
layer for the stem cell populations in culture or in vivo. Stromal support cells for feeder layers 
may include embryonic bone marrow fibroblasts, bone marrow stromal cells, fetal liver cells, or 
cultured embryonic fibroblasts (see U.S. Patent No. 5,690,926). 

Stem cells themselves can be transfected with a polynucleotide of the invention to induce 
autocrine expression of the polypeptide of the invention. This will allow for generation of 
undifferentiated totipotential/pluripotential stem cell lines that are useful as is or that can then be 
differentiated into the desired mature cell types. These stable cell lines can also serve as a source 
of undifferentiated totipotential/pluripotential mRNA to create cDNA libraries and templates for 
polymerase chain reaction experiments. These studies would allow for the isolation and 
identification of differentially expressed genes in stem cell populations that regulate stem cell 
proliferation and/or maintenance. 

Expansion and maintenance of totipotent stem cell populations will be useful in the 
treatment of many pathological conditions. For example, polypeptides of the present invention 
may be used to manipulate stem cells in culture to give rise to neuroepithelial cells that can be 
used to augment or replace cells damaged by illness, autoimmune disease, accidental damage or 
genetic disorders. The polypeptide of the invention may be useful for inducing the proliferation 
of neural cells and for the regeneration of nerve and brain tissue, i.e., for the treatment of central 
and peripheral nervous system diseases and neuropathies, as well as mechanical and traumatic 
disorders which involve degeneration, death or trauma to neural cells or nerve tissue. In addition, 
the expanded stem cell populations can also be genetically altered for gene therapy purposes and 
to decrease host rejection of replacement tissues after grafting or implantation. 

Expression of the polypeptide of the invention and its effect on stem cells can also be
manipulated to achieve controlled differentiation of the stem cells into more differentiated cell
types. A broadly applicable method of obtaining pure populations of a specific differentiated
cell type from undifferentiated stem cell populations involves the use of a cell-type specific
promoter driving a selectable marker. The selectable marker allows only cells of the desired type
to survive. For example, stem cells can be induced to differentiate into cardiomyocytes (Wobus
et al., Differentiation, 48: 173-182, (1991); Klug et al., J. Clin. Invest., 98(1): 216-224, (1998))
or skeletal muscle cells (Browder, L. W. In: Principles of Tissue Engineering eds. Lanza et al.,
Academic Press (1997)). Alternatively, directed differentiation of stem cells can be
accomplished by culturing the stem cells in the presence of a differentiation factor such as
retinoic acid and an antagonist of the polypeptide of the invention which would inhibit the
effects of endogenous stem cell factor activity and allow differentiation to proceed.

In vitro cultures of stem cells can be used to determine if the polypeptide of the invention
exhibits stem cell growth factor activity. Stem cells are isolated from any one of various cell
sources (including hematopoietic stem cells and embryonic stem cells) and cultured on a feeder
layer, as described by Thompson et al. Proc. Natl. Acad. Sci. U.S.A., 92: 7844-7848 (1995), in
the presence of the polypeptide of the invention alone or in combination with other growth
factors or cytokines. The ability of the polypeptide of the invention to induce stem cell
proliferation is determined by colony formation on semi-solid support, e.g., as described by
Bernstein et al., Blood, 77: 2316-2321 (1991).

4.10.5 HEMATOPOIESIS REGULATING ACTIVITY 

A polypeptide of the present invention may be involved in regulation of hematopoiesis
and, consequently, in the treatment of myeloid or lymphoid cell disorders. Even marginal
biological activity in support of colony forming cells or of factor-dependent cell lines indicates
involvement in regulating hematopoiesis, e.g., in supporting the growth and proliferation of
erythroid progenitor cells alone or in combination with other cytokines, thereby indicating
utility, for example, in treating various anemias or for use in conjunction with
irradiation/chemotherapy to stimulate the production of erythroid precursors and/or erythroid
cells; in supporting the growth and proliferation of myeloid cells such as granulocytes and
monocytes/macrophages (i.e., traditional CSF activity) useful, for example, in conjunction with
chemotherapy to prevent or treat consequent myelo-suppression; in supporting the growth and
proliferation of megakaryocytes and consequently of platelets thereby allowing prevention or
treatment of various platelet disorders such as thrombocytopenia, and generally for use in place
of or complementary to platelet transfusions; and/or in supporting the growth and proliferation of
hematopoietic stem cells which are capable of maturing to any and all of the above-mentioned
hematopoietic cells and therefore find therapeutic utility in various stem cell disorders (such as
those usually treated with transplantation, including, without limitation, aplastic anemia and
paroxysmal nocturnal hemoglobinuria), as well as in repopulating the stem cell compartment
post irradiation/chemotherapy, either in vivo or ex vivo (i.e., in conjunction with bone marrow
transplantation or with peripheral progenitor cell transplantation (homologous or heterologous))
as normal cells or genetically manipulated for gene therapy.

Therapeutic compositions of the invention can be used in the following: 

Suitable assays for proliferation and differentiation of various hematopoietic lines are
cited above.

Assays for embryonic stem cell differentiation (which will identify, among others,
proteins that influence embryonic differentiation hematopoiesis) include, without limitation,
those described in: Johansson et al. Cellular Biology 15:141-151, 1995; Keller et al., Molecular
and Cellular Biology 13:473-486, 1993; McClanahan et al., Blood 81:2903-2915, 1993.

Assays for stem cell survival and differentiation (which will identify, among others,
proteins that regulate lympho-hematopoiesis) include, without limitation, those described in:
Methylcellulose colony forming assays, Freshney, M. G. In Culture of Hematopoietic Cells. R. I.
Freshney, et al. eds. Vol pp. 265-268, Wiley-Liss, Inc., New York, N.Y. 1994; Hirayama et al.,
Proc. Natl. Acad. Sci. USA 89:5907-5911, 1992; Primitive hematopoietic colony forming cells
with high proliferative potential, McNiece, I. K. and Briddell, R. A. In Culture of Hematopoietic
Cells. R. I. Freshney, et al. eds. Vol pp. 23-39, Wiley-Liss, Inc., New York, N.Y. 1994; Neben et
al., Experimental Hematology 22:353-359, 1994; Cobblestone area forming cell assay,
Ploemacher, R. E. In Culture of Hematopoietic Cells. R. I. Freshney, et al. eds. Vol pp. 1-21,
Wiley-Liss, Inc., New York, N.Y. 1994; Long term bone marrow cultures in the presence of
stromal cells, Spooncer, E., Dexter, M. and Allen, T. In Culture of Hematopoietic Cells. R. I.
Freshney, et al. eds. Vol pp. 163-179, Wiley-Liss, Inc., New York, N.Y. 1994; Long term culture
initiating cell assay, Sutherland, H. J. In Culture of Hematopoietic Cells. R. I. Freshney, et al.
eds. Vol pp. 139-162, Wiley-Liss, Inc., New York, N.Y. 1994.

4.10.6 TISSUE GROWTH ACTIVITY 

A polypeptide of the present invention also may be involved in bone, cartilage, tendon, 
ligament and/or nerve tissue growth or regeneration, as well as in wound healing and tissue 
repair and replacement, and in healing of burns, incisions and ulcers. 

A polypeptide of the present invention which induces cartilage and/or bone growth in
circumstances where bone is not normally formed has application in the healing of bone
fractures and cartilage damage or defects in humans and other animals. Compositions of a
polypeptide, antibody, binding partner, or other modulator of the invention may have
prophylactic use in closed as well as open fracture reduction and also in the improved fixation of
artificial joints. De novo bone formation induced by an osteogenic agent contributes to the
repair of congenital, trauma-induced, or oncologic resection-induced craniofacial defects, and
also is useful in cosmetic plastic surgery.

A polypeptide of this invention may also be involved in attracting bone-forming cells,
stimulating growth of bone-forming cells, or inducing differentiation of progenitors of
bone-forming cells. Treatment of osteoporosis, osteoarthritis, bone degenerative disorders, or
periodontal disease, such as through stimulation of bone and/or cartilage repair or by blocking
inflammation or processes of tissue destruction (collagenase activity, osteoclast activity, etc.)
mediated by inflammatory processes, may also be possible using the composition of the
invention.

Another category of tissue regeneration activity that may involve the polypeptide of the
present invention is tendon/ligament formation. Induction of tendon/ligament-like tissue or
other tissue formation in circumstances where such tissue is not normally formed has
application in the healing of tendon or ligament tears, deformities and other tendon or ligament
defects in humans and other animals. Such a preparation employing a tendon/ligament-like
tissue inducing protein may have prophylactic use in preventing damage to tendon or ligament
tissue, as well as use in the improved fixation of tendon or ligament to bone or other tissues, and
in repairing defects to tendon or ligament tissue. De novo tendon/ligament-like tissue formation
induced by a composition of the present invention contributes to the repair of congenital,
trauma-induced, or other tendon or ligament defects of other origin, and is also useful in cosmetic plastic
surgery for attachment or repair of tendons or ligaments. The compositions of the present
invention may provide an environment to attract tendon- or ligament-forming cells, stimulate
growth of tendon- or ligament-forming cells, induce differentiation of progenitors of tendon- or
ligament-forming cells, or induce growth of tendon/ligament cells or progenitors ex vivo for
return in vivo to effect tissue repair. The compositions of the invention may also be useful in the
treatment of tendinitis, carpal tunnel syndrome and other tendon or ligament defects. The
compositions may also include an appropriate matrix and/or sequestering agent as a carrier as is
well known in the art.

The compositions of the present invention may also be useful for proliferation of neural
cells and for regeneration of nerve and brain tissue, i.e., for the treatment of central and peripheral
nervous system diseases and neuropathies, as well as mechanical and traumatic disorders, which
involve degeneration, death or trauma to neural cells or nerve tissue. More specifically, a
composition may be used in the treatment of diseases of the peripheral nervous system, such as
peripheral nerve injuries, peripheral neuropathy and localized neuropathies, and central nervous
system diseases, such as Alzheimer's disease, Parkinson's disease, Huntington's disease, amyotrophic
lateral sclerosis, and Shy-Drager syndrome. Further conditions which may be treated in
accordance with the present invention include mechanical and traumatic disorders, such as spinal
cord disorders, head trauma and cerebrovascular diseases such as stroke. Peripheral neuropathies
resulting from chemotherapy or other medical therapies may also be treatable using a
composition of the invention.

Compositions of the invention may also be useful to promote better or faster closure of 
non-healing wounds, including without limitation pressure ulcers, ulcers associated with vascular 
insufficiency, surgical and traumatic wounds, and the like. 

Compositions of the present invention may also be involved in the generation or
regeneration of other tissues, such as organs (including, for example, pancreas, liver, intestine,
kidney, skin, endothelium), muscle (smooth, skeletal or cardiac) and vascular (including vascular
endothelium) tissue, or for promoting the growth of cells comprising such tissues. Part of the
desired effects may be by inhibition or modulation of fibrotic scarring to allow normal tissue
to regenerate. A polypeptide of the present invention may also exhibit angiogenic activity.

A composition of the present invention may also be useful for gut protection or
regeneration and treatment of lung or liver fibrosis, reperfusion injury in various tissues, and
conditions resulting from systemic cytokine damage.

A composition of the present invention may also be useful for promoting or inhibiting
differentiation of tissues described above from precursor tissues or cells; or for inhibiting the
growth of tissues described above.

Therapeutic compositions of the invention can be used in the following: 

Assays for tissue generation activity include, without limitation, those described in:
International Patent Publication No. WO95/16035 (bone, cartilage, tendon); International Patent
Publication No. WO95/05846 (nerve, neuronal); International Patent Publication No.
WO91/07491 (skin, endothelium).

Assays for wound healing activity include, without limitation, those described in: Winter,
Epidermal Wound Healing, pp. 71-112 (Maibach, H. I. and Rovee, D. T., eds.), Year Book
Medical Publishers, Inc., Chicago, as modified by Eaglstein and Mertz, J. Invest. Dermatol.
71:382-84 (1978).

4.10.7 IMMUNE STIMULATING OR SUPPRESSING ACTIVITY 

A polypeptide of the present invention may also exhibit immune stimulating or immune
suppressing activity, including without limitation the activities for which assays are described
herein. A polynucleotide of the invention can encode a polypeptide exhibiting such activities. A
protein may be useful in the treatment of various immune deficiencies and disorders (including
severe combined immunodeficiency (SCID)), e.g., in regulating (up or down) growth and
proliferation of T and/or B lymphocytes, as well as effecting the cytolytic activity of NK cells
and other cell populations. These immune deficiencies may be genetic or be caused by viral (e.g.,
HIV) as well as bacterial or fungal infections, or may result from autoimmune disorders. More
specifically, infectious diseases caused by viral, bacterial, fungal or other infection may be
treatable using a protein of the present invention, including infections by HIV, hepatitis viruses,
herpes viruses, mycobacteria, Leishmania spp., malaria spp. and various fungal infections such
as candidiasis. Of course, in this regard, proteins of the present invention may also be useful
where a boost to the immune system generally may be desirable, i.e., in the treatment of cancer.

Autoimmune disorders which may be treated using a protein of the present invention
include, for example, connective tissue disease, multiple sclerosis, systemic lupus erythematosus,
rheumatoid arthritis, autoimmune pulmonary inflammation, Guillain-Barre syndrome,
autoimmune thyroiditis, insulin dependent diabetes mellitus, myasthenia gravis, graft-versus-host
disease and autoimmune inflammatory eye disease. Such a protein (or antagonists thereof,
including antibodies) of the present invention may also be useful in the treatment of allergic
reactions and conditions (e.g., anaphylaxis, serum sickness, drug reactions, food allergies, insect
venom allergies, mastocytosis, allergic rhinitis, hypersensitivity pneumonitis, urticaria,
angioedema, eczema, atopic dermatitis, allergic contact dermatitis, erythema multiforme,
Stevens-Johnson syndrome, allergic conjunctivitis, atopic keratoconjunctivitis, vernal
keratoconjunctivitis, giant papillary conjunctivitis and contact allergies), such as asthma
(particularly allergic asthma) or other respiratory problems. Other conditions, in which immune
suppression is desired (including, for example, organ transplantation), may also be treatable
using a protein (or antagonists thereof) of the present invention. The therapeutic effects of the
polypeptides or antagonists thereof on allergic reactions can be evaluated by in vivo animal
models such as the cumulative contact enhancement test (Lastbom et al., Toxicology 125: 59-66,
1998), skin prick test (Hoffmann et al., Allergy 54: 446-54, 1999), guinea pig skin sensitization
test (Vohr et al., Arch. Toxicol. 73: 501-9), and murine local lymph node assay (Kimber et al.,
J. Toxicol. Environ. Health 53: 563-79).

Using the proteins of the invention it may also be possible to modulate immune
responses in a number of ways. Down regulation may be in the form of inhibiting or blocking an
immune response already in progress or may involve preventing the induction of an immune 
response. The functions of activated T cells may be inhibited by suppressing T cell responses or 
by inducing specific tolerance in T cells, or both. Immunosuppression of T cell responses is 
generally an active, non-antigen-specific, process which requires continuous exposure of the T 
cells to the suppressive agent. Tolerance, which involves inducing non-responsiveness or anergy 
in T cells, is distinguishable from immunosuppression in that it is generally antigen-specific and 
persists after exposure to the tolerizing agent has ceased. Operationally, tolerance can be 
demonstrated by the lack of a T cell response upon reexposure to specific antigen in the absence 
of the tolerizing agent. 

Down regulating or preventing one or more antigen functions (including without
limitation B lymphocyte antigen functions (such as, for example, B7)), e.g., preventing high
level lymphokine synthesis by activated T cells, will be useful in situations of tissue, skin and
organ transplantation and in graft-versus-host disease (GVHD). For example, blockage of T cell
function should result in reduced tissue destruction in tissue transplantation. Typically, in tissue
transplants, rejection of the transplant is initiated through its recognition as foreign by T cells,
followed by an immune reaction that destroys the transplant. The administration of a therapeutic
composition of the invention may prevent cytokine synthesis by immune cells, such as T cells,
and thus acts as an immunosuppressant. Moreover, a lack of costimulation may also be sufficient
to anergize the T cells, thereby inducing tolerance in a subject. Induction of long-term tolerance
by B lymphocyte antigen-blocking reagents may avoid the necessity of repeated administration
of these blocking reagents. To achieve sufficient immunosuppression or tolerance in a subject, it
may also be necessary to block the function of a combination of B lymphocyte antigens.

The efficacy of particular therapeutic compositions in preventing organ transplant
rejection or GVHD can be assessed using animal models that are predictive of efficacy in
humans. Examples of appropriate systems which can be used include allogeneic cardiac grafts in
rats and xenogeneic pancreatic islet cell grafts in mice, both of which have been used to examine
the immunosuppressive effects of CTLA4Ig fusion proteins in vivo as described in Lenschow et
al., Science 257:789-792 (1992) and Turka et al., Proc. Natl. Acad. Sci. USA, 89:11102-11105
(1992). In addition, murine models of GVHD (see Paul ed., Fundamental Immunology, Raven
Press, New York, 1989, pp. 846-847) can be used to determine the effect of therapeutic
compositions of the invention on the development of that disease.

Blocking antigen function may also be therapeutically useful for treating autoimmune
diseases. Many autoimmune disorders are the result of inappropriate activation of T cells that are
reactive against self tissue and which promote the production of cytokines and autoantibodies
involved in the pathology of the diseases. Preventing the activation of autoreactive T cells may
reduce or eliminate disease symptoms. Administration of reagents which block stimulation of T
cells can be used to inhibit T cell activation and prevent production of autoantibodies or T
cell-derived cytokines which may be involved in the disease process. Additionally, blocking
reagents may induce antigen-specific tolerance of autoreactive T cells which could lead to
long-term relief from the disease. The efficacy of blocking reagents in preventing or alleviating
autoimmune disorders can be determined using a number of well-characterized animal models of
human autoimmune diseases. Examples include murine experimental autoimmune encephalitis,
systemic lupus erythematosus in MRL/lpr/lpr mice or NZB hybrid mice, murine autoimmune
collagen arthritis, diabetes mellitus in NOD mice and BB rats, and murine experimental
myasthenia gravis (see Paul ed., Fundamental Immunology, Raven Press, New York, 1989, pp.
840-856).

Upregulation of an antigen function (e.g., a B lymphocyte antigen function), as a means
of upregulating immune responses, may also be useful in therapy. Upregulation of immune
responses may be in the form of enhancing an existing immune response or eliciting an initial
immune response. For example, enhancing an immune response may be useful in cases of viral
infection, including systemic viral diseases such as influenza, the common cold, and
encephalitis.

Alternatively, anti-viral immune responses may be enhanced in an infected patient by 
removing T cells from the patient, costimulating the T cells in vitro with viral antigen-pulsed 
APCs either expressing a peptide of the present invention or together with a stimulatory form of 
a soluble peptide of the present invention and reintroducing the in vitro activated T cells into the 
patient. Another method of enhancing anti-viral immune responses would be to isolate infected 
cells from a patient, transfect them with a nucleic acid encoding a protein of the present 
invention as described herein such that the cells express all or a portion of the protein on their 
surface, and reintroduce the transfected cells into the patient. The infected cells would now be 
capable of delivering a costimulatory signal to, and thereby activate, T cells in vivo. 

A polypeptide of the present invention may provide the necessary stimulation signal to T 
cells to induce a T cell mediated immune response against the transfected tumor cells. In 
addition, tumor cells which lack MHC class I or MHC class II molecules, or which fail to 
reexpress sufficient amounts of MHC class I or MHC class II molecules, can be transfected with
nucleic acid encoding all or a portion (e.g., a cytoplasmic-domain truncated portion) of an
MHC class I alpha chain protein and β2 microglobulin protein or an MHC class II alpha chain
protein and an MHC class II beta chain protein to thereby express MHC class I or MHC class II 
proteins on the cell surface. Expression of the appropriate class I or class II MHC in conjunction 
with a peptide having the activity of a B lymphocyte antigen (e.g., B7-1, B7-2, B7-3) induces a T 
cell mediated immune response against the transfected tumor cell. Optionally, a gene encoding 
an antisense construct which blocks expression of an MHC class II associated protein, such as 
the invariant chain, can also be cotransfected with a DNA encoding a peptide having the activity 
of a B lymphocyte antigen to promote presentation of tumor associated antigens and induce 
tumor specific immunity. Thus, the induction of a T cell mediated immune response in a human 
subject may be sufficient to overcome tumor-specific tolerance in the subject. 

The activity of a protein of the invention may, among other means, be measured by the 
following methods: 

Suitable assays for thymocyte or splenocyte cytotoxicity include, without limitation, 
those described in: Current Protocols in Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, D. 
H. Margulies, E. M. Shevach, W. Strober, Pub. Greene Publishing Associates and 
Wiley-Interscience (Chapter 3, In Vitro assays for Mouse Lymphocyte Function 3.1-3.19; 
Chapter 7, Immunologic studies in Humans); Herrmann et al., Proc. Natl. Acad. Sci. USA 
78:2488-2492, 1981; Herrmann et al., J. Immunol. 128:1968-1974, 1982; Handa et al., J.
Immunol. 135:1564-1572, 1985; Takai et al., J. Immunol. 137:3494-3500, 1986; Takai et al., J.
Immunol. 140:508-512, 1988; Bowman et al., J. Virology 61:1992-1998; Bertagnolli et al.,
Cellular Immunology 133:327-341, 1991; Brown et al., J. Immunol. 153:3079-3092, 1994.

Assays for T-cell-dependent immunoglobulin responses and isotype switching (which
will identify, among others, proteins that modulate T-cell dependent antibody responses and that
affect Th1/Th2 profiles) include, without limitation, those described in: Maliszewski, J.
Immunol. 144:3028-3033, 1990; and Assays for B cell function: In vitro antibody production, 
Mond, J. J. and Brunswick, M. In Current Protocols in Immunology. J. E. e.a. Coligan eds. Vol 1 
pp. 3.8.1-3.8.16, John Wiley and Sons, Toronto. 1994. 

Mixed lymphocyte reaction (MLR) assays (which will identify, among others, proteins
that generate predominantly Th1 and CTL responses) include, without limitation, those described
in: Current Protocols in Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, D. H. Margulies, E.
M. Shevach, W. Strober, Pub. Greene Publishing Associates and Wiley-Interscience (Chapter 3,
In Vitro assays for Mouse Lymphocyte Function 3.1-3.19; Chapter 7, Immunologic studies in
Humans); Takai et al., J. Immunol. 137:3494-3500, 1986; Takai et al., J. Immunol. 140:508-512,
1988; Bertagnolli et al., J. Immunol. 149:3778-3783, 1992.

Dendritic cell-dependent assays (which will identify, among others, proteins expressed
by dendritic cells that activate naive T-cells) include, without limitation, those described in:
Guery et al., J. Immunol. 134:536-544, 1995; Inaba et al., Journal of Experimental Medicine
173:549-559, 1991; Macatonia et al., Journal of Immunology 154:5071-5079, 1995; Porgador et
al., Journal of Experimental Medicine 182:255-260, 1995; Nair et al., Journal of Virology
67:4062-4069, 1993; Huang et al., Science 264:961-965, 1994; Macatonia et al., Journal of
Experimental Medicine 169:1255-1264, 1989; Bhardwaj et al., Journal of Clinical Investigation
94:797-807, 1994; and Inaba et al., Journal of Experimental Medicine 172:631-640, 1990.

Assays for lymphocyte survival/apoptosis (which will identify, among others, proteins
that prevent apoptosis after superantigen induction and proteins that regulate lymphocyte
homeostasis) include, without limitation, those described in: Darzynkiewicz et al., Cytometry
13:795-808, 1992; Gorczyca et al., Leukemia 7:659-670, 1993; Gorczyca et al., Cancer Research
53:1945-1951, 1993; Itoh et al., Cell 66:233-243, 1991; Zacharchuk, Journal of Immunology
145:4037-4045, 1990; Zamai et al., Cytometry 14:891-897, 1993; Gorczyca et al., International
Journal of Oncology 1:639-648, 1992.

Assays for proteins that influence early steps of T-cell commitment and development
include, without limitation, those described in: Antica et al., Blood 84:111-117, 1994; Fine et al.,
Cellular Immunology 155:111-122, 1994; Galy et al., Blood 85:2770-2778, 1995; Toki et al.,
Proc. Nat. Acad. Sci. USA 88:7548-7551, 1991.

4.10.8 ACTIVIN/INHIBIN ACTIVITY

A polypeptide of the present invention may also exhibit activin- or inhibin-related 
activities. A polynucleotide of the invention may encode a polypeptide exhibiting such 
characteristics. Inhibins are characterized by their ability to inhibit the release of follicle
stimulating hormone (FSH), while activins are characterized by their ability to stimulate the
release of follicle stimulating hormone (FSH). Thus, a polypeptide of the present invention,
alone or in heterodimers with a member of the inhibin family, may be useful as a contraceptive
based on the ability of inhibins to decrease fertility in female mammals and decrease
spermatogenesis in male mammals. Administration of sufficient amounts of other inhibins can
induce infertility in these mammals. Alternatively, the polypeptide of the invention, as a
homodimer or as a heterodimer with other protein subunits of the inhibin group, may be useful as
a fertility inducing therapeutic, based upon the ability of activin molecules to stimulate FSH
release from cells of the anterior pituitary. See, for example, U.S. Pat. No. 4,798,885. A
polypeptide of the invention may also be useful for advancement of the onset of fertility in 
sexually immature mammals, so as to increase the lifetime reproductive performance of domestic 
animals such as, but not limited to, cows, sheep and pigs. 

The activity of a polypeptide of the invention may, among other means, be measured by 
the following methods. 

Assays for activin/inhibin activity include, without limitation, those described in: Vale et
al., Endocrinology 91:562-572, 1972; Ling et al., Nature 321:779-782, 1986; Vale et al., Nature
321:776-779, 1986; Mason et al., Nature 318:659-663, 1985; Forage et al., Proc. Natl. Acad. Sci.
USA 83:3091-3095, 1986.

4.10.9 CHEMOTACTIC/CHEMOKINETIC ACTIVITY 

A polypeptide of the present invention may be involved in chemotactic or chemokinetic 
activity for mammalian cells, including, for example, monocytes, fibroblasts, neutrophils, 
T-cells, mast cells, eosinophils, epithelial and/or endothelial cells. A polynucleotide of the 
invention can encode a polypeptide exhibiting such attributes. Chemotactic and chemokinetic 
receptor activation can be used to mobilize or attract a desired cell population to a desired site of 
action. Chemotactic or chemokinetic compositions (e.g. proteins, antibodies, binding partners, or 
modulators of the invention) provide particular advantages in treatment of wounds and other 
trauma to tissues, as well as in treatment of localized infections. For example, attraction of 
lymphocytes, monocytes or neutrophils to tumors or sites of infection may result in improved 
immune responses against the tumor or infecting agent.

A protein or peptide has chemotactic activity for a particular cell population if it can
stimulate, directly or indirectly, the directed orientation or movement of such cell population.
Preferably, the protein or peptide has the ability to directly stimulate directed movement of cells.
Whether a particular protein has chemotactic activity for a population of cells can be readily
determined by employing such protein or peptide in any known assay for cell chemotaxis.

Therapeutic compositions of the invention can be used in the following:

Assays for chemotactic activity (which will identify proteins that induce or prevent
chemotaxis) consist of assays that measure the ability of a protein to induce the migration of
cells across a membrane as well as the ability of a protein to induce the adhesion of one cell
population to another cell population. Suitable assays for movement and adhesion include,
without limitation, those described in: Current Protocols in Immunology, Ed by J. E. Coligan, A.
M. Kruisbeek, D. H. Margulies, E. M. Shevach, W. Strober, Pub. Greene Publishing Associates
and Wiley-Interscience (Chapter 6.12, Measurement of alpha and beta Chemokines
6.12.1-6.12.28); Taub et al. J. Clin. Invest. 95:1370-1376, 1995; Lind et al. APMIS 103:140-146,
1995; Muller et al. Eur. J. Immunol. 25:1744-1748; Gruber et al. J. of Immunol. 152:5860-5867,
1994; Johnston et al. J. of Immunol. 153:1762-1768, 1994.

4.10.10 HEMOSTATIC AND THROMBOLYTIC ACTIVITY 

A polypeptide of the invention may also be involved in hemostasis or thrombolysis or
thrombosis. A polynucleotide of the invention can encode a polypeptide exhibiting such
attributes. Compositions may be useful in treatment of various coagulation disorders (including
hereditary disorders, such as hemophilias) or to enhance coagulation and other hemostatic events
in treating wounds resulting from trauma, surgery or other causes. A composition of the
invention may also be useful for dissolving or inhibiting formation of thromboses and for
treatment and prevention of conditions resulting therefrom (such as, for example, infarction of
cardiac and central nervous system vessels (e.g., stroke)).

Therapeutic compositions of the invention can be used in the following:

Assays for hemostatic and thrombolytic activity include, without limitation, those
described in: Linet et al., J. Clin. Pharmacol. 26:131-140, 1986; Burdick et al., Thrombosis Res.
45:413-419, 1987; Humphrey et al., Fibrinolysis 5:71-79 (1991); Schaub, Prostaglandins
35:467-474, 1988.

4.10.11 CANCER DIAGNOSIS AND THERAPY 

Polypeptides of the invention may be involved in cancer cell generation, proliferation or 
metastasis. Detection of the presence or amount of polynucleotides or polypeptides of the 
invention may be useful for the diagnosis and/or prognosis of one or more types of cancer. For 
example, the presence or increased expression of a polynucleotide/polypeptide of the invention 
may indicate a hereditary risk of cancer, a precancerous condition, or an ongoing malignancy. 
Conversely, a defect in the gene or absence of the polypeptide may be associated with a cancer 
condition. Identification of single nucleotide polymorphisms associated with cancer or a 
predisposition to cancer may also be useful for diagnosis or prognosis. 

Cancer treatments promote tumor regression by inhibiting tumor cell proliferation, 
inhibiting angiogenesis (growth of new blood vessels that is necessary to support tumor growth) 
and/or prohibiting metastasis by reducing tumor cell motility or invasiveness. Therapeutic 
compositions of the invention may be effective in adult and pediatric oncology including in solid 
phase tumors/malignancies, locally advanced tumors, human soft tissue sarcomas, metastatic 
cancer, including lymphatic metastases, blood cell malignancies including multiple myeloma, 
acute and chronic leukemias, and lymphomas, head and neck cancers including mouth cancer, 
larynx cancer and thyroid cancer, lung cancers including small cell carcinoma and non-small cell 
cancers, breast cancers including small cell carcinoma and ductal carcinoma, gastrointestinal 
cancers including esophageal cancer, stomach cancer, colon cancer, colorectal cancer and polyps 
associated with colorectal neoplasia, pancreatic cancers, liver cancer, urologic cancers including 
bladder cancer and prostate cancer, malignancies of the female genital tract including ovarian 
carcinoma, uterine (including endometrial) cancers, and solid tumor in the ovarian follicle, 
kidney cancers including renal cell carcinoma, brain cancers including intrinsic brain tumors, 
neuroblastoma, astrocytic brain tumors, gliomas, metastatic tumor cell invasion in the central 
nervous system, bone cancers including osteomas, skin cancers including malignant melanoma, 
tumor progression of human skin keratinocytes, squamous cell carcinoma, basal cell carcinoma, 
hemangiopericytoma and Kaposi's sarcoma. 

Polypeptides, polynucleotides, or modulators of polypeptides of the invention (including 
inhibitors and stimulators of the biological activity of the polypeptide of the invention) may be 
administered to treat cancer. Therapeutic compositions can be administered in therapeutically 
effective dosages alone or in combination with adjuvant cancer therapy such as surgery, 
chemotherapy, radiotherapy, thermotherapy, and laser therapy, and may provide a beneficial 
effect, e.g. reducing tumor size, slowing rate of tumor growth, inhibiting metastasis, or otherwise 
improving overall clinical condition, without necessarily eradicating the cancer. 

The composition can also be administered in therapeutically effective amounts as a 
portion of an anti-cancer cocktail. An anti-cancer cocktail is a mixture of the polypeptide or 
modulator of the invention with one or more anti-cancer drugs in addition to a pharmaceutically 
acceptable carrier for delivery. The use of anti-cancer cocktails as a cancer treatment is routine. 
Anti-cancer drugs that are well known in the art and can be used as a treatment in combination 
with the polypeptide or modulator of the invention include: Actinomycin D, Aminoglutethimide, 
Asparaginase, Bleomycin, Busulfan, Carboplatin, Carmustine, Chlorambucil, Cisplatin 
(cis-DDP), Cyclophosphamide, Cytarabine HCl (Cytosine arabinoside), Dacarbazine, Dactinomycin, 
Daunorubicin HCl, Doxorubicin HCl, Estramustine phosphate sodium, Etoposide (V16-213), 
Floxuridine, 5-Fluorouracil (5-Fu), Flutamide, Hydroxyurea (hydroxycarbamide), Ifosfamide, 
Interferon Alpha-2a, Interferon Alpha-2b, Leuprolide acetate (LHRH-releasing factor analog), 
Lomustine, Mechlorethamine HCl (nitrogen mustard), Melphalan, Mercaptopurine, Mesna, 
Methotrexate (MTX), Mitomycin, Mitoxantrone HCl, Octreotide, Plicamycin, Procarbazine HCl, 
Streptozocin, Tamoxifen citrate, Thioguanine, Thiotepa, Vinblastine sulfate, Vincristine sulfate, 
Amsacrine, Azacitidine, Hexamethylmelamine, Interleukin-2, Mitoguazone, Pentostatin, 
Semustine, Teniposide, and Vindesine sulfate. 

In addition, therapeutic compositions of the invention may be used for prophylactic 
treatment of cancer. There are hereditary conditions and/or environmental situations (e.g. 
exposure to carcinogens) known in the art that predispose an individual to developing cancers. 
Under these circumstances, it may be beneficial to treat these individuals with therapeutically 
effective doses of the polypeptide of the invention to reduce the risk of developing cancers. 

In vitro models can be used to determine the effective doses of the polypeptide of the 
invention as a potential cancer treatment. These in vitro models include proliferation assays of 
cultured tumor cells, growth of cultured tumor cells in soft agar (see Freshney, (1987) Culture of 
Animal Cells: A Manual of Basic Technique, Wiley-Liss, New York, NY Ch 18 and Ch 21), 
tumor systems in nude mice as described in Giovanella et al., J. Natl. Can. Inst., 52: 921-30 
(1974), mobility and invasive potential of tumor cells in Boyden Chamber assays as described in 
Pilkington et al., Anticancer Res., 17: 4107-9 (1997), and angiogenesis assays such as induction 
of vascularization of the chick chorioallantoic membrane or induction of vascular endothelial 
cell migration as described in Ribatti et al., Intl. J. Dev. Biol., 40: 1189-97 (1999) and Li et al., 
Clin. Exp. Metastasis, 17:423-9 (1999), respectively. Suitable tumor cell lines are available, 
e.g. from American Type Tissue Culture Collection catalogs. 

4.10.12 RECEPTOR/LIGAND ACTIVITY 

A polypeptide of the present invention may also demonstrate activity as a receptor, 
receptor ligand or inhibitor or agonist of receptor/ligand interactions. A polynucleotide of the 
invention can encode a polypeptide exhibiting such characteristics. Examples of such receptors 
and ligands include, without limitation, cytokine receptors and their ligands, receptor kinases and 
their ligands, receptor phosphatases and their ligands, receptors involved in cell-cell interactions 
and their ligands (including without limitation, cellular adhesion molecules (such as selectins, 
integrins and their ligands)) and receptor/ligand pairs involved in antigen presentation, antigen 
recognition and development of cellular and humoral immune responses. Receptors and ligands 
are also useful for screening of potential peptide or small molecule inhibitors of the relevant 
receptor/ligand interaction. A protein of the present invention (including, without limitation, 
fragments of receptors and ligands) may itself be useful as an inhibitor of receptor/ligand 
interactions. 

The activity of a polypeptide of the invention may, among other means, be measured by 
the following methods: 

Suitable assays for receptor-ligand activity include without limitation those described in: 
Current Protocols in Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, D. H. Margulies, E. M. 
Shevach, W. Strober, Pub. Greene Publishing Associates and Wiley-Interscience (Chapter 7.28, 
Measurement of Cellular Adhesion under static conditions 7.28.1-7.28.22), Takai et al., Proc. 
Natl. Acad. Sci. USA 84:6864-6868, 1987; Bierer et al., J. Exp. Med. 168:1145-1156, 1988; 
Rosenstein et al., J. Exp. Med. 169:149-160, 1989; Stoltenborg et al., J. Immunol. Methods 
175:59-68, 1994; Stitt et al., Cell 80:661-670, 1995. 

By way of example, the polypeptides of the invention may be used as a receptor for a 
ligand(s) thereby transmitting the biological activity of that ligand(s). Ligands may be identified 
through binding assays, affinity chromatography, dihybrid screening assays, BIAcore assays, gel 
overlay assays, or other methods known in the art. 

Studies characterizing drugs or proteins as agonists, antagonists, partial agonists or 
partial antagonists require the use of other proteins as competing ligands. The polypeptides of the 
present invention or ligand(s) thereof may be labeled by being coupled to radioisotopes, 
colorimetric molecules or toxin molecules by conventional methods ("Guide to Protein 
Purification" Murray P. Deutscher (ed) Methods in Enzymology Vol. 182 (1990) Academic 
Press, Inc. San Diego). Examples of radioisotopes include, but are not limited to, tritium and 
carbon-14. Examples of colorimetric molecules include, but are not limited to, fluorescent 
molecules such as fluorescamine, or rhodamine or other colorimetric molecules. Examples of 
toxins include, but are not limited to, ricin. 

4.10.13 DRUG SCREENING 

This invention is particularly useful for screening chemical compounds by using the 
novel polypeptides or binding fragments thereof in any of a variety of drug screening techniques. 
The polypeptides or fragments employed in such a test may either be free in solution, affixed to a 
solid support, borne on a cell surface or located intracellularly. One method of drug screening 
utilizes eukaryotic or prokaryotic host cells which are stably transformed with recombinant 
nucleic acids expressing the polypeptide or a fragment thereof. Drugs are screened against such 
transformed cells in competitive binding assays. Such cells, either in viable or fixed form, can 
be used for standard binding assays. One may measure, for example, the formation of 
complexes between polypeptides of the invention or fragments and the agent being tested or 
examine the diminution in complex formation between the novel polypeptides and an 
appropriate cell line, which are well known in the art. 
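
As an illustrative sketch only, and not a method stated in this specification, the diminution in complex formation seen in a competitive binding assay could be expressed as percent inhibition relative to a control without the test agent. The function name, the background correction and the signal values below are hypothetical.

```python
def percent_inhibition(signal_with_agent, signal_control, background=0.0):
    """Return the percent reduction in bound-complex signal caused by a test agent.

    All inputs are hypothetical assay readouts in the same arbitrary units
    (for example, counts from a labeled ligand); `background` is the
    nonspecific signal subtracted from both readings.
    """
    specific_control = signal_control - background
    if specific_control <= 0:
        raise ValueError("control signal must exceed background")
    specific_agent = signal_with_agent - background
    return 100.0 * (1.0 - specific_agent / specific_control)


# Hypothetical readouts: control 1000, with agent 400, background 200.
inhibition = percent_inhibition(400.0, 1000.0, background=200.0)
print(f"{inhibition:.1f}% inhibition")
```

A compound that leaves the signal unchanged gives 0% under this definition, and one that reduces it to background gives 100%.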

Sources for test compounds that may be screened for ability to bind to or modulate (i.e., 
increase or decrease) the activity of polypeptides of the invention include (1) inorganic and 
organic chemical libraries, (2) natural product libraries, and (3) combinatorial libraries 
comprised of either random or mimetic peptides, oligonucleotides or organic molecules. 

Chemical libraries may be readily synthesized or purchased from a number of 
commercial sources, and may include structural analogs of known compounds or compounds 
that are identified as "hits" or "leads" via natural product screening. 

The sources of natural product libraries are microorganisms (including bacteria and 
fungi), animals, plants or other vegetation, or marine organisms, and libraries of mixtures for 
screening may be created by: (1) fermentation and extraction of broths from soil, plant or marine 
microorganisms or (2) extraction of the organisms themselves. Natural product libraries include 
polyketides, non-ribosomal peptides, and (non-naturally occurring) variants thereof. For a 
review, see Science 282:63-68 (1998). 

Combinatorial libraries are composed of large numbers of peptides, oligonucleotides or 
organic compounds and can be readily prepared by traditional automated synthesis methods, 
PCR, cloning or proprietary synthetic methods. Of particular interest are peptide and 
oligonucleotide combinatorial libraries. Still other libraries of interest include peptide, protein, 
peptidomimetic, multiparallel synthetic collection, recombinatorial, and polypeptide libraries. 
For a review of combinatorial chemistry and libraries created therefrom, see Myers, Curr. Opin. 
Biotechnol 8:701-707 (1997). For reviews and examples of peptidomimetic libraries, see 
Al-Obeidi et al., Mol Biotechnol, 9(3):205-23 (1998); Hruby et al., Curr Opin Chem Biol, 
1(1):114-19 (1997); Dorner et al., Bioorg Med Chem, 4(5):709-15 (1996) (alkylated dipeptides). 

Identification of modulators through use of the various libraries described herein permits 
modification of the candidate "hit" (or "lead") to optimize the capacity of the "hit" to bind a 
polypeptide of the invention. The molecules identified in the binding assay are then tested for 
antagonist or agonist activity in in vivo tissue culture or animal models that are well known in the 
art. In brief, the molecules are titrated into a plurality of cell cultures or animals and then tested 
for either cell/animal death or prolonged survival of the animal/cells. 

The binding molecules thus identified may be complexed with toxins, e.g., ricin or 
cholera, or with other compounds that are toxic to cells such as radioisotopes. The toxin-binding 
molecule complex is then targeted to a tumor or other cell by the specificity of the binding 
molecule for a polypeptide of the invention. Alternatively, the binding molecules may be 
complexed with imaging agents for targeting and imaging purposes. 

4.10.14 ASSAY FOR RECEPTOR ACTIVITY 

The invention also provides methods to detect specific binding of a polypeptide, e.g., a 
ligand or a receptor. The art provides numerous assays particularly useful for identifying 
previously unknown binding partners for receptor polypeptides of the invention. For example, 
expression cloning using mammalian or bacterial cells, or dihybrid screening assays can be used 
to identify polynucleotides encoding binding partners. As another example, affinity 
chromatography with the appropriate immobilized polypeptide of the invention can be used to 
isolate polypeptides that recognize and bind polypeptides of the invention. There are a number 
of different libraries used for the identification of compounds, and in particular small molecules, 
that modulate (i.e., increase or decrease) biological activity of a polypeptide of the invention. 
Ligands for receptor polypeptides of the invention can also be identified by adding exogenous 
ligands or cocktails of ligands to two cell populations that are genetically identical except for 
the expression of the receptor of the invention: one cell population expresses the receptor of the 
invention whereas the other does not. The responses of the two cell populations to the addition of 
ligand(s) are then compared. Alternatively, an expression library can be co-expressed with the 
polypeptide of the invention in cells and assayed for an autocrine response to identify potential 
ligand(s). As still another example, BIAcore assays, gel overlay assays, or other methods known 
in the art can be used to identify binding partner polypeptides, including, (1) organic and 
inorganic chemical libraries, (2) natural product libraries, and (3) combinatorial libraries 
comprised of random peptides, oligonucleotides or organic molecules. 
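
The comparison of two otherwise identical cell populations described above can be sketched as follows. This is a minimal illustration under stated assumptions: the ligand names, the response values, the use of a fold-change criterion and its threshold are all hypothetical and are not specified in this disclosure.

```python
def differential_responders(with_receptor, without_receptor, min_fold=2.0):
    """Return ligands whose response is at least `min_fold` higher in
    receptor-expressing cells than in otherwise identical control cells.

    Both arguments map a ligand name to a measured response (arbitrary,
    positive units). The names, values and fold threshold are hypothetical.
    """
    hits = []
    for ligand, response in with_receptor.items():
        baseline = without_receptor.get(ligand)
        if baseline is None or baseline <= 0:
            continue  # no usable control measurement for this ligand
        if response / baseline >= min_fold:
            hits.append(ligand)
    return sorted(hits)


expressing = {"ligand_A": 9.0, "ligand_B": 1.1, "ligand_C": 4.2}
control = {"ligand_A": 1.0, "ligand_B": 1.0, "ligand_C": 4.0}
print(differential_responders(expressing, control))
```

Here only the hypothetical `ligand_A` passes, because `ligand_C` produces a similar response whether or not the receptor is expressed.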

The role of downstream intracellular signaling molecules in the signaling cascade of the 
polypeptide of the invention can be determined. For example, a chimeric protein in which the 
cytoplasmic domain of the polypeptide of the invention is fused to the extracellular portion of a 
protein, whose ligand has been identified, is produced in a host cell. The cell is then incubated 
with the ligand specific for the extracellular portion of the chimeric protein, thereby activating 
the chimeric receptor. Known downstream proteins involved in intracellular signaling can then 
be assayed for expected modifications, i.e., phosphorylation. Other methods known to those in the 
art can also be used to identify signaling molecules involved in receptor activity. 

4.10.15 ANTI-INFLAMMATORY ACTIVITY 

Compositions of the present invention may also exhibit anti-inflammatory activity. The 
anti-inflammatory activity may be achieved by providing a stimulus to cells involved in the 
inflammatory response, by inhibiting or promoting cell-cell interactions (such as, for example, 
cell adhesion), by inhibiting or promoting chemotaxis of cells involved in the inflammatory 
process, inhibiting or promoting cell extravasation, or by stimulating or suppressing production 
of other factors which more directly inhibit or promote an inflammatory response. Compositions 
with such activities can be used to treat inflammatory conditions (including chronic or acute 
conditions), including without limitation inflammation associated with infection (such as septic 
shock, sepsis or systemic inflammatory response syndrome (SIRS)), ischemia-reperfusion injury, 
endotoxin lethality, arthritis, complement-mediated hyperacute rejection, nephritis, cytokine or 
chemokine-induced lung injury, inflammatory bowel disease, Crohn's disease or resulting from 
over production of cytokines such as TNF or IL-1. Compositions of the invention may also be 
useful to treat anaphylaxis and hypersensitivity to an antigenic substance or material. 
Compositions of this invention may be utilized to prevent or treat conditions such as, but not 
limited to, sepsis, acute pancreatitis, endotoxin shock, cytokine induced shock, rheumatoid 
arthritis, chronic inflammatory arthritis, pancreatic cell damage from diabetes mellitus type 1, 
graft versus host disease, inflammatory bowel disease, inflammation associated with pulmonary 
disease, other autoimmune disease or inflammatory disease, or as an antiproliferative agent such as for 
acute or chronic myelogenous leukemia or in the prevention of premature labor secondary to 
intrauterine infections. 

4.10.16 LEUKEMIAS 

Leukemias and related disorders may be treated or prevented by administration of a 
therapeutic that promotes or inhibits function of the polynucleotides and/or polypeptides of the 
invention. Such leukemias and related disorders include but are not limited to acute leukemia, 
acute lymphocytic leukemia, acute myelocytic leukemia, myeloblastic, promyelocytic, 
myelomonocytic, monocytic, erythroleukemia, chronic leukemia, chronic myelocytic 
(granulocytic) leukemia and chronic lymphocytic leukemia (for a review of such disorders, see 
Fishman et al., 1985, Medicine, 2d Ed., J.B. Lippincott Co., Philadelphia). 

4.10.17 NERVOUS SYSTEM DISORDERS 

Nervous system disorders, involving cell types which can be tested for efficacy of 
intervention with compounds that modulate the activity of the polynucleotides and/or 
polypeptides of the invention, and which can be treated upon thus observing an indication of 
therapeutic utility, include but are not limited to nervous system injuries, and diseases or 
disorders which result in either a disconnection of axons, a diminution or degeneration of 
neurons, or demyelination. Nervous system lesions which may be treated in a patient (including 
human and non-human mammalian patients) according to the invention include but are not 
limited to the following lesions of either the central (including spinal cord, brain) or peripheral 
nervous systems: 

(i) traumatic lesions, including lesions caused by physical injury or associated with 
surgery, for example, lesions which sever a portion of the nervous system, or compression 
injuries; 

(ii) ischemic lesions, in which a lack of oxygen in a portion of the nervous system 
results in neuronal injury or death, including cerebral infarction or ischemia, or spinal cord 
infarction or ischemia; 

(iii) infectious lesions, in which a portion of the nervous system is destroyed or 
injured as a result of infection, for example, by an abscess or associated with infection by human 
immunodeficiency virus, herpes zoster, or herpes simplex virus or with Lyme disease, 
tuberculosis, syphilis; 

(iv) degenerative lesions, in which a portion of the nervous system is destroyed or 
injured as a result of a degenerative process including but not limited to degeneration associated 
with Parkinson's disease, Alzheimer's disease, Huntington's chorea, or amyotrophic lateral 
sclerosis; 

(v) lesions associated with nutritional diseases or disorders, in which a portion of the 
nervous system is destroyed or injured by a nutritional disorder or disorder of metabolism 
including but not limited to, vitamin B12 deficiency, folic acid deficiency, Wernicke disease, 
tobacco-alcohol amblyopia, Marchiafava-Bignami disease (primary degeneration of the corpus 
callosum), and alcoholic cerebellar degeneration; 

(vi) neurological lesions associated with systemic diseases including but not limited to 
diabetes (diabetic neuropathy, Bell's palsy), systemic lupus erythematosus, carcinoma, or 
sarcoidosis; 

(vii) lesions caused by toxic substances including alcohol, lead, or particular 
neurotoxins; and 

(viii) demyelinated lesions in which a portion of the nervous system is destroyed or 
injured by a demyelinating disease including but not limited to multiple sclerosis, human 
immunodeficiency virus-associated myelopathy, transverse myelopathy of various etiologies, 
progressive multifocal leukoencephalopathy, and central pontine myelinolysis. 

Therapeutics which are useful according to the invention for treatment of a nervous 
system disorder may be selected by testing for biological activity in promoting the survival or 
differentiation of neurons. For example, and not by way of limitation, therapeutics which elicit 
any of the following effects may be useful according to the invention: 
(i) increased survival time of neurons in culture; 

(ii) increased sprouting of neurons in culture or in vivo; 

(iii) increased production of a neuron-associated molecule in culture or in vivo, e.g., 
choline acetyltransferase or acetylcholinesterase with respect to motor neurons; or 

(iv) decreased symptoms of neuron dysfunction in vivo. 

Such effects may be measured by any method known in the art. In preferred, 
non-limiting embodiments, increased survival of neurons may be measured by the method set 
forth in Arakawa et al. (1990, J. Neurosci. 10:3507-3515); increased sprouting of neurons may 
be detected by methods set forth in Pestronk et al. (1980, Exp. Neurol. 70:65-82) or Brown et al. 
(1981, Ann. Rev. Neurosci. 4:17-42); increased production of neuron-associated molecules may 
be measured by bioassay, enzymatic assay, antibody binding, Northern blot assay, etc., 
depending on the molecule to be measured; and motor neuron dysfunction may be measured by 
assessing the physical manifestation of motor neuron disorder, e.g., weakness, motor neuron 
conduction velocity, or functional disability. 

In specific embodiments, motor neuron disorders that may be treated according to the 
invention include but are not limited to disorders such as infarction, infection, exposure to toxin, 
trauma, surgical damage, degenerative disease or malignancy that may affect motor neurons as 
well as other components of the nervous system, as well as disorders that selectively affect 
neurons such as amyotrophic lateral sclerosis, and including but not limited to progressive spinal 
muscular atrophy, progressive bulbar palsy, primary lateral sclerosis, infantile and juvenile 
muscular atrophy, progressive bulbar paralysis of childhood (Fazio-Londe syndrome), 
poliomyelitis and the post polio syndrome, and Hereditary Motorsensory Neuropathy 
(Charcot-Marie-Tooth Disease). 

4.10.18 OTHER ACTIVITIES 

A polypeptide of the invention may also exhibit one or more of the following additional 
activities or effects: inhibiting the growth, infection or function of, or killing, infectious agents, 
including, without limitation, bacteria, viruses, fungi and other parasites; affecting (suppressing 
or enhancing) bodily characteristics, including, without limitation, height, weight, hair color, eye 
color, skin, fat to lean ratio or other tissue pigmentation, or organ or body part size or shape 
(such as, for example, breast augmentation or diminution, change in bone form or shape); 
affecting biorhythms or circadian cycles or rhythms; affecting the fertility of male or female 
subjects; affecting the metabolism, catabolism, anabolism, processing, utilization, storage or 
elimination of dietary fat, lipid, protein, carbohydrate, vitamins, minerals, co-factors or other 
nutritional factors or component(s); affecting behavioral characteristics, including, without 
limitation, appetite, libido, stress, cognition (including cognitive disorders), depression 
(including depressive disorders) and violent behaviors; providing analgesic effects or other pain 
reducing effects; promoting differentiation and growth of embryonic stem cells in lineages other 
than hematopoietic lineages; hormonal or endocrine activity; in the case of enzymes, correcting 
deficiencies of the enzyme and treating deficiency-related diseases; treatment of 
hyperproliferative disorders (such as, for example, psoriasis); immunoglobulin-like activity (such 
as, for example, the ability to bind antigens or complement); and the ability to act as an antigen 
in a vaccine composition to raise an immune response against such protein or another material or 
entity which is cross-reactive with such protein. 

4.10.19 IDENTIFICATION OF POLYMORPHISMS 

The demonstration of polymorphisms makes possible the identification of such 
polymorphisms in human subjects and the pharmacogenetic use of this information for diagnosis 
and treatment. Such polymorphisms may be associated with, e.g., differential predisposition or 
susceptibility to various disease states (such as disorders involving inflammation or immune 
response) or a differential response to drug administration, and this genetic information can be 
used to tailor preventive or therapeutic treatment appropriately. For example, the existence of a 
polymorphism associated with a predisposition to inflammation or autoimmune disease makes 
possible the diagnosis of this condition in humans by identifying the presence of the 
polymorphism. 

Polymorphisms can be identified in a variety of ways known in the art which all 
generally involve obtaining a sample from a patient, analyzing DNA from the sample, optionally 
involving isolation or amplification of the DNA, and identifying the presence of the 
polymorphism in the DNA. For example, PCR may be used to amplify an appropriate fragment 
of genomic DNA which may then be sequenced. Alternatively, the DNA may be subjected to 
allele-specific oligonucleotide hybridization (in which appropriate oligonucleotides are 
hybridized to the DNA under conditions permitting detection of a single base mismatch) or to a 
single nucleotide extension assay (in which an oligonucleotide that hybridizes immediately 
adjacent to the position of the polymorphism is extended with one or more labeled nucleotides). 
In addition, traditional restriction fragment length polymorphism analysis (using restriction 
enzymes that provide differential digestion of the genomic DNA depending on the presence or 
absence of the polymorphism) may be performed. Arrays with nucleotide sequences of the 
present invention can be used to detect polymorphisms. The array can comprise modified 
nucleotide sequences of the present invention in order to detect the nucleotide sequences of the 
present invention. In the alternative, any one of the nucleotide sequences of the present 
invention can be placed on the array to detect changes from those sequences. 

Alternatively a polymorphism resulting in a change in the amino acid sequence could 
also be detected by detecting a corresponding change in amino acid sequence of the protein, e.g., 
by an antibody specific to the variant sequence. 
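
Once an amplified fragment has been sequenced, identifying a single-base polymorphism reduces to comparing the sample sequence with a reference. The sketch below is one simple way to do that, assuming the two sequences are already aligned and of equal length; the function name and the example sequences are hypothetical and are not taken from the sequence listing.

```python
def find_single_base_differences(reference, sample):
    """List (position, reference_base, sample_base) for each mismatch.

    Positions are 1-based. Both sequences must already be aligned and of
    equal length; insertions and deletions are not handled in this sketch.
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be the same length")
    return [
        (i + 1, ref_base, smp_base)
        for i, (ref_base, smp_base) in enumerate(zip(reference.upper(), sample.upper()))
        if ref_base != smp_base
    ]


# Hypothetical sequences, not taken from the sequence listing.
reference_seq = "ATGGCCTTAGC"
sample_seq = "ATGGCTTTAGC"
print(find_single_base_differences(reference_seq, sample_seq))
```

For these example sequences the only difference is a C to T change at position 6. Real data would normally need an alignment step first, which is outside the scope of this sketch.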

4.10.20 ARTHRITIS AND INFLAMMATION 

The immunosuppressive effects of the compositions of the invention against rheumatoid 
arthritis are determined in an experimental animal model system. The experimental model system 
is adjuvant induced arthritis in rats, and the protocol is described by J. Holoshitz, et al., 1983, 
Science, 219:56, or by B. Waksman et al., 1963, Int. Arch. Allergy Appl. Immunol., 23:129. 
Induction of the disease can be caused by a single injection, generally intradermally, of a 
suspension of killed Mycobacterium tuberculosis in complete Freund's adjuvant (CFA). The 
route of injection can vary, but rats may be injected at the base of the tail with an adjuvant 
mixture. The polypeptide is administered in phosphate buffered solution (PBS) at a dose of about 

The procedure for testing the effects of the test compound would consist of intradermally 
injecting killed Mycobacterium tuberculosis in CFA followed by immediately administering the 
test compound and subsequent treatment every other day until day 24. At 14, 15, 18, 20, 22, and 
24 days after injection of Mycobacterium CFA, an overall arthritis score may be obtained as 
described by J. Holoshitz above. An analysis of the data would reveal that the test compound 
would have a dramatic effect on the swelling of the joints as measured by a decrease of the 
arthritis score. 
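
The dosing and scoring schedule in this protocol can be laid out programmatically. The following is a rough sketch that assumes day 0 is the day of adjuvant injection and that "every other day" means days 0, 2, 4 and so on through day 24; the function names and any scores passed in are hypothetical.

```python
def treatment_days(last_day=24, interval=2):
    """Days (0 = day of adjuvant injection) on which the compound is given."""
    return list(range(0, last_day + 1, interval))


SCORING_DAYS = [14, 15, 18, 20, 22, 24]  # scoring days listed in the protocol


def mean_score(scores_by_day):
    """Average arthritis score over the scoring days that were recorded."""
    values = [scores_by_day[d] for d in SCORING_DAYS if d in scores_by_day]
    if not values:
        raise ValueError("no scores recorded on any scoring day")
    return sum(values) / len(values)


print(treatment_days())
```

Comparing `mean_score` for a treated group against the PBS-only control group would give one simple summary of whether the arthritis score decreased.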

4.11 THERAPEUTIC METHODS 

The compositions (including polypeptide fragments, analogs, variants and antibodies or 
other binding partners or modulators including antisense polynucleotides) of the invention have 
numerous applications in a variety of therapeutic methods. Examples of therapeutic applications 
include, but are not limited to, those exemplified herein. 

4.11.1 EXAMPLE 



WO 01/57188 PCT/US01/03800 



One embodiment of the invention is the administration of an effective amount of the 
polypeptides or other composition of the invention to individuals affected by a disease or 
disorder that can be modulated by regulating the peptides of the invention. While the mode of 
administration is not particularly important, parenteral administration is preferred. An 
exemplary mode of administration is to deliver an intravenous bolus. The dosage of the 
polypeptides or other composition of the invention will normally be determined by the 
prescribing physician. It is to be expected that the dosage will vary according to the age, weight, 
condition and response of the individual patient. Typically, the amount of polypeptide 
administered per dose will be in the range of about 0.01 ng/kg to 100 mg/kg of body weight, with 
the preferred dose being about 0.1 µg/kg to 10 mg/kg of patient body weight. For parenteral 
administration, polypeptides of the invention will be formulated in an injectable form combined 
with a pharmaceutically acceptable parenteral vehicle. Such vehicles are well known in the art 
and examples include water, saline, Ringer's solution, dextrose solution, and solutions consisting 
of small amounts of human serum albumin. The vehicle may contain minor amounts of 
additives that maintain the isotonicity and stability of the polypeptide or other active ingredient. 
The preparation of such solutions is within the skill of the art. 
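
Because the dose is stated per kilogram of body weight, the total amount per dose scales linearly with patient weight. The arithmetic is sketched below for illustration only, reading the preferred range as about 0.1 µg/kg (100 ng/kg) to 10 mg/kg; that reading, the function name and the 70 kg example weight are assumptions, and the actual dosage remains a matter for the prescribing physician as stated above.

```python
NG_PER_MG = 1_000_000  # nanograms in one milligram


def dose_range_mg(weight_kg, low_per_kg_ng=100.0, high_per_kg_mg=10.0):
    """Return (low, high) total dose in mg for a patient of `weight_kg`.

    Defaults assume the preferred range is read as about 0.1 ug/kg
    (= 100 ng/kg) to 10 mg/kg; this is an interpretation, not a prescription.
    """
    if weight_kg <= 0:
        raise ValueError("weight must be positive")
    low_mg = weight_kg * low_per_kg_ng / NG_PER_MG
    high_mg = weight_kg * high_per_kg_mg
    return low_mg, high_mg


low, high = dose_range_mg(70.0)
print(f"{low:.3f} mg to {high:.0f} mg")
```

For the hypothetical 70 kg patient this gives 0.007 mg to 700 mg per dose, which shows how wide the stated range is.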

4.12 PHARMACEUTICAL FORMULATIONS AND ROUTES OF 
ADMINISTRATION 

A protein or other composition of the present invention (from whatever source derived,
including without limitation from recombinant and non-recombinant sources and including
antibodies and other binding partners of the polypeptides of the invention) may be administered
to a patient in need, by itself, or in pharmaceutical compositions where it is mixed with suitable
carriers or excipient(s) at doses to treat or ameliorate a variety of disorders. Such a composition
may optionally contain (in addition to protein or other active ingredient and a carrier) diluents,
fillers, salts, buffers, stabilizers, solubilizers, and other materials well known in the art. The term 
"pharmaceutically acceptable" means a non-toxic material that does not interfere with the 
effectiveness of the biological activity of the active ingredient(s). The characteristics of the 
carrier will depend on the route of administration. The pharmaceutical composition of the
invention may also contain cytokines, lymphokines, or other hematopoietic factors such as
M-CSF, GM-CSF, TNF, IL-1, IL-2, IL-3, IL-4, IL-5, IL-6, IL-7, IL-8, IL-9, IL-10, IL-11, IL-12,
IL-13, IL-14, IL-15, IFN, TNF0, TNF1, TNF2, G-CSF, Meg-CSF, thrombopoietin, stem cell
factor, and erythropoietin. In further compositions, proteins of the invention may be combined
with other agents beneficial to the treatment of the disease or disorder in question. These agents
include various growth factors such as epidermal growth factor (EGF), platelet-derived growth
factor (PDGF), transforming growth factors (TGF-α and TGF-β), insulin-like growth factor
(IGF), as well as cytokines described herein.

The pharmaceutical composition may further contain other agents which either enhance 
the activity of the protein or other active ingredient or complement its activity or use in
treatment. Such additional factors and/or agents may be included in the pharmaceutical
composition to produce a synergistic effect with protein or other active ingredient of the
invention, or to minimize side effects. Conversely, protein or other active ingredient of the
present invention may be included in formulations of the particular clotting factor, cytokine,
lymphokine, other hematopoietic factor, thrombolytic or anti-thrombotic factor, or
anti-inflammatory agent to minimize side effects of the clotting factor, cytokine, lymphokine, other
hematopoietic factor, thrombolytic or anti-thrombotic factor, or anti-inflammatory agent (such as
IL-1Ra, IL-1 Hy1, IL-1 Hy2, anti-TNF, corticosteroids, immunosuppressive agents). A protein
of the present invention may be active in multimers (e.g., heterodimers or homodimers) or
complexes with itself or other proteins. As a result, pharmaceutical compositions of the
invention may comprise a protein of the invention in such multimeric or complexed form.

As an alternative to being included in a pharmaceutical composition of the invention 
including a first protein, a second protein or a therapeutic agent may be concurrently 
administered with the first protein (e.g., at the same time, or at differing times provided that
therapeutic concentrations of the combination of agents are achieved at the treatment site).
Techniques for formulation and administration of the compounds of the instant application may
be found in "Remington's Pharmaceutical Sciences," Mack Publishing Co., Easton, PA, latest
edition. A therapeutically effective dose further refers to that amount of the compound sufficient
to result in amelioration of symptoms, e.g., treatment, healing, prevention or amelioration of the
relevant medical condition, or an increase in rate of treatment, healing, prevention or
amelioration of such conditions. When applied to an individual active ingredient, administered
alone, a therapeutically effective dose refers to that ingredient alone. When applied to a 
combination, a therapeutically effective dose refers to combined amounts of the active 
ingredients that result in the therapeutic effect, whether administered in combination, serially or 
simultaneously. 

In practicing the method of treatment or use of the present invention, a therapeutically
effective amount of protein or other active ingredient of the present invention is administered to
a mammal having a condition to be treated. Protein or other active ingredient of the present
invention may be administered in accordance with the method of the invention either alone or in
combination with other therapies such as treatments employing cytokines, lymphokines or other
hematopoietic factors. When co-administered with one or more cytokines, lymphokines or other
hematopoietic factors, protein or other active ingredient of the present invention may be
administered either simultaneously with the cytokine(s), lymphokine(s), other hematopoietic
factor(s), thrombolytic or anti-thrombotic factors, or sequentially. If administered sequentially,
the attending physician will decide on the appropriate sequence of administering protein or other
active ingredient of the present invention in combination with cytokine(s), lymphokine(s), other
hematopoietic factor(s), thrombolytic or anti-thrombotic factors.

4.12.1 ROUTES OF ADMINISTRATION

Suitable routes of administration may, for example, include oral, rectal, transmucosal, or 
intestinal administration; parenteral delivery, including intramuscular, subcutaneous,
intramedullary injections, as well as intrathecal, direct intraventricular, intravenous,
intraperitoneal, intranasal, or intraocular injections. Administration of protein or other active
ingredient of the present invention used in the pharmaceutical composition or to practice the
method of the present invention can be carried out in a variety of conventional ways, such as oral
ingestion, inhalation, topical application or cutaneous, subcutaneous, intraperitoneal, parenteral
or intravenous injection. Intravenous administration to the patient is preferred. 

Alternately, one may administer the compound in a local rather than systemic manner, for
example, via injection of the compound directly into arthritic joints or fibrotic tissue, often
in a depot or sustained release formulation. In order to prevent the scarring process frequently
occurring as a complication of glaucoma surgery, the compounds may be administered topically,
for example, as eye drops. Furthermore, one may administer the drug in a targeted drug delivery 
system, for example, in a liposome coated with a specific antibody, targeting, for example, 
arthritic or fibrotic tissue. The liposomes will be targeted to and taken up selectively by the 
afflicted tissue. 

The polypeptides of the invention are administered by any route that delivers an effective
dosage to the desired site of action. The determination of a suitable route of administration and
an effective dosage for a particular indication is within the level of skill in the art. Preferably for
wound treatment, one administers the therapeutic compound directly to the site. Suitable dosage
ranges for the polypeptides of the invention can be extrapolated from these dosages or from
similar studies in appropriate animal models. Dosages can then be adjusted as necessary by the
clinician to provide maximal therapeutic benefit. 

4.12.2 COMPOSITIONS/FORMULATIONS 

Pharmaceutical compositions for use in accordance with the present invention thus may 
be formulated in a conventional manner using one or more physiologically acceptable carriers
comprising excipients and auxiliaries which facilitate processing of the active compounds into
preparations which can be used pharmaceutically. These pharmaceutical compositions may be 
manufactured in a manner that is itself known, e.g., by means of conventional mixing, 
dissolving, granulating, dragee-making, levigating, emulsifying, encapsulating, entrapping or 
lyophilizing processes. Proper formulation is dependent upon the route of administration 
chosen. When a therapeutically effective amount of protein or other active ingredient of the 
present invention is administered orally, protein or other active ingredient of the present 
invention will be in the form of a tablet, capsule, powder, solution or elixir. When administered 
in tablet form, the pharmaceutical composition of the invention may additionally contain a solid 
carrier such as a gelatin or an adjuvant. The tablet, capsule, and powder contain from about 5 to 
95% protein or other active ingredient of the present invention, and preferably from about 25 to 
90% protein or other active ingredient of the present invention. When administered in liquid 
form, a liquid carrier such as water, petroleum, oils of animal or plant origin such as peanut oil, 
mineral oil, soybean oil, or sesame oil, or synthetic oils may be added. The liquid form of the 
pharmaceutical composition may further contain physiological saline solution, dextrose or other 
saccharide solution, or glycols such as ethylene glycol, propylene glycol or polyethylene glycol. 
When administered in liquid form, the pharmaceutical composition contains from about 0.5 to 
90% by weight of protein or other active ingredient of the present invention, and preferably from
about 1 to 50% protein or other active ingredient of the present invention.

When a therapeutically effective amount of protein or other active ingredient of the 
present invention is administered by intravenous, cutaneous or subcutaneous injection, protein or 
other active ingredient of the present invention will be in the form of a pyrogen-free, parenterally
acceptable aqueous solution. The preparation of such parenterally acceptable protein or other 
active ingredient solutions, having due regard to pH, isotonicity, stability, and the like, is within 
the skill in the art. A preferred pharmaceutical composition for intravenous, cutaneous, or 
subcutaneous injection should contain, in addition to protein or other active ingredient of the 
present invention, an isotonic vehicle such as Sodium Chloride Injection, Ringer's Injection, 
Dextrose Injection, Dextrose and Sodium Chloride Injection, Lactated Ringer's Injection, or 
other vehicle as known in the art. The pharmaceutical composition of the present invention may 
also contain stabilizers, preservatives, buffers, antioxidants, or other additives known to those of 
skill in the art. For injection, the agents of the invention may be formulated in aqueous 
solutions, preferably in physiologically compatible buffers such as Hanks's solution, Ringer's 
solution, or physiological saline buffer. For transmucosal administration, penetrants appropriate 
to the barrier to be permeated are used in the formulation. Such penetrants are generally known 
in the art.

For oral administration, the compounds can be formulated readily by combining the
active compounds with pharmaceutically acceptable carriers well known in the art. Such carriers 
enable the compounds of the invention to be formulated as tablets, pills, dragees, capsules, 
liquids, gels, syrups, slurries, suspensions and the like, for oral ingestion by a patient to be 
treated. Pharmaceutical preparations for oral use can be obtained from a solid excipient,
optionally grinding a resulting mixture, and processing the mixture of granules, after adding
suitable auxiliaries, if desired, to obtain tablets or dragee cores. Suitable excipients are, in
particular, fillers such as sugars, including lactose, sucrose, mannitol, or sorbitol; cellulose
preparations such as, for example, maize starch, wheat starch, rice starch, potato starch, gelatin,
gum tragacanth, methyl cellulose, hydroxypropylmethyl-cellulose, sodium
carboxymethylcellulose, and/or polyvinylpyrrolidone (PVP). If desired, disintegrating agents
may be added, such as the cross-linked polyvinyl pyrrolidone, agar, or alginic acid or a salt
thereof such as sodium alginate. Dragee cores are provided with suitable coatings. For this
purpose, concentrated sugar solutions may be used, which may optionally contain gum arabic,
talc, polyvinyl pyrrolidone, carbopol gel, polyethylene glycol, and/or titanium dioxide, lacquer
solutions, and suitable organic solvents or solvent mixtures. Dyestuffs or pigments may be
added to the tablets or dragee coatings for identification or to characterize different combinations
of active compound doses.

Pharmaceutical preparations which can be used orally include push-fit capsules made of
gelatin, as well as soft, sealed capsules made of gelatin and a plasticizer, such as glycerol or
sorbitol. The push-fit capsules can contain the active ingredients in admixture with filler such as
lactose, binders such as starches, and/or lubricants such as talc or magnesium stearate and,
optionally, stabilizers. In soft capsules, the active compounds may be dissolved or suspended in
suitable liquids, such as fatty oils, liquid paraffin, or liquid polyethylene glycols. In addition,
stabilizers may be added. All formulations for oral administration should be in dosages suitable
for such administration. For buccal administration, the compositions may take the form of 
tablets or lozenges formulated in conventional manner. 

For administration by inhalation, the compounds for use according to the present 
invention are conveniently delivered in the form of an aerosol spray presentation from
pressurized packs or a nebuliser, with the use of a suitable propellant, e.g.,
dichlorodifluoromethane, trichlorofluoromethane, dichlorotetrafluoroethane, carbon dioxide or
other suitable gas. In the case of a pressurized aerosol the dosage unit may be determined by
providing a valve to deliver a metered amount. Capsules and cartridges of, e.g., gelatin for use
in an inhaler or insufflator may be formulated containing a powder mix of the compound and a
suitable powder base such as lactose or starch. The compounds may be formulated for parenteral
administration by injection, e.g., by bolus injection or continuous infusion. Formulations for
injection may be presented in unit dosage form, e.g., in ampules or in multi-dose containers, with
an added preservative. The compositions may take such forms as suspensions, solutions or 
emulsions in oily or aqueous vehicles, and may contain formulatory agents such as suspending, 
stabilizing and/or dispersing agents. 

Pharmaceutical formulations for parenteral administration include aqueous solutions of 
the active compounds in water-soluble form. Additionally, suspensions of the active compounds 
may be prepared as appropriate oily injection suspensions. Suitable lipophilic solvents or 
vehicles include fatty oils such as sesame oil, or synthetic fatty acid esters, such as ethyl oleate or 
triglycerides, or liposomes. Aqueous injection suspensions may contain substances which 
increase the viscosity of the suspension, such as sodium carboxymethyl cellulose, sorbitol, or 
dextran. Optionally, the suspension may also contain suitable stabilizers or agents which 
increase the solubility of the compounds to allow for the preparation of highly concentrated 
solutions. Alternatively, the active ingredient may be in powder form for constitution with a 
suitable vehicle, e.g., sterile pyrogen-free water, before use. 

The compounds may also be formulated in rectal compositions such as suppositories or 
retention enemas, e.g., containing conventional suppository bases such as cocoa butter or other 
glycerides. In addition to the formulations described previously, the compounds may also be 
formulated as a depot preparation. Such long acting formulations may be administered by 
implantation (for example subcutaneously or intramuscularly) or by intramuscular injection. 
Thus, for example, the compounds may be formulated with suitable polymeric or hydrophobic 
materials (for example as an emulsion in an acceptable oil) or ion exchange resins, or as 
sparingly soluble derivatives, for example, as a sparingly soluble salt. 

A pharmaceutical carrier for the hydrophobic compounds of the invention is a co-solvent 
system comprising benzyl alcohol, a nonpolar surfactant, a water-miscible organic polymer, and 
an aqueous phase. The co-solvent system may be the VPD co-solvent system. VPD is a solution 
of 3% w/v benzyl alcohol, 8% w/v of the nonpolar surfactant polysorbate 80, and 65% w/v 
polyethylene glycol 300, made up to volume in absolute ethanol. The VPD co-solvent system 
(VPD:5W) consists of VPD diluted 1:1 with a 5% dextrose in water solution. This co-solvent 
system dissolves hydrophobic compounds well, and itself produces low toxicity upon systemic 
administration. Naturally, the proportions of a co-solvent system may be varied considerably 
without destroying its solubility and toxicity characteristics. Furthermore, the identity of the 
co-solvent components may be varied; for example, other low-toxicity nonpolar surfactants may 
be used instead of polysorbate 80; the fraction size of polyethylene glycol may be varied; other 
biocompatible polymers may replace polyethylene glycol, e.g., polyvinyl pyrrolidone; and other
sugars or polysaccharides may substitute for dextrose. Alternatively, other delivery systems for
hydrophobic pharmaceutical compounds may be employed. Liposomes and emulsions are well
known examples of delivery vehicles or carriers for hydrophobic drugs. Certain organic solvents
such as dimethylsulfoxide also may be employed, although usually at the cost of greater toxicity.
Additionally, the compounds may be delivered using a sustained-release system, such as
semipermeable matrices of solid hydrophobic polymers containing the therapeutic agent. 
Various types of sustained-release materials have been established and are well known by those 
skilled in the art. Sustained-release capsules may, depending on their chemical nature, release the 
compounds for a few weeks up to over 100 days. Depending on the chemical nature and the
biological stability of the therapeutic reagent, additional strategies for protein or other active
ingredient stabilization may be employed.
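
By way of illustration only, the VPD and VPD:5W compositions recited above can be worked through as simple arithmetic. The following Python sketch is not part of the disclosed method; the function names are hypothetical, "% w/v" is read as grams per 100 mL, and the 1:1 dilution is assumed to mean equal, additive volumes of VPD and 5% dextrose in water.

```python
# Illustrative arithmetic only; not a validated compounding procedure.
# Percentages are taken from the VPD description above (% w/v = g per 100 mL).
VPD_PERCENT_WV = {
    "benzyl alcohol": 3.0,
    "polysorbate 80": 8.0,
    "polyethylene glycol 300": 65.0,
}
DEXTROSE_PERCENT_WV = 5.0  # the 5% dextrose in water diluent


def vpd_masses_g(volume_ml):
    """Grams of each solute needed for volume_ml of undiluted VPD.

    The remainder of the volume is made up with absolute ethanol.
    """
    if volume_ml <= 0:
        raise ValueError("volume_ml must be positive")
    return {name: pct * volume_ml / 100.0 for name, pct in VPD_PERCENT_WV.items()}


def vpd_5w_percent_wv():
    """Approximate % w/v of each solute in VPD:5W.

    Assumption: 1:1 means equal volumes, and the volumes are additive,
    so every VPD solute is halved and dextrose ends up at 2.5% w/v.
    """
    final = {name: pct / 2.0 for name, pct in VPD_PERCENT_WV.items()}
    final["dextrose"] = DEXTROSE_PERCENT_WV / 2.0
    return final


if __name__ == "__main__":
    print(vpd_masses_g(100.0))
    print(vpd_5w_percent_wv())
```

Under these assumptions, varying the proportions of the co-solvent system, as contemplated above, amounts to changing the entries of the percentage table.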

The pharmaceutical compositions also may comprise suitable solid or gel phase carriers 
or excipients. Examples of such carriers or excipients include but are not limited to calcium 
carbonate, calcium phosphate, various sugars, starches, cellulose derivatives, gelatin, and
polymers such as polyethylene glycols. Many of the active ingredients of the invention may be
provided as salts with pharmaceutically compatible counter ions. Such pharmaceutically
acceptable base addition salts are those salts which retain the biological effectiveness and
properties of the free acids and which are obtained by reaction with inorganic or organic bases
such as sodium hydroxide, magnesium hydroxide, ammonia, trialkylamine, dialkylamine,
monoalkylamine, dibasic amino acids, sodium acetate, potassium benzoate, triethanol amine and
the like. 

The pharmaceutical composition of the invention may be in the form of a complex of the 
protein(s) or other active ingredient(s) of the present invention along with protein or peptide
antigens. The protein and/or peptide antigen will deliver a stimulatory signal to both B and T
lymphocytes. B lymphocytes will respond to antigen through their surface immunoglobulin
receptor. T lymphocytes will respond to antigen through the T cell receptor (TCR) following
presentation of the antigen by MHC proteins. MHC and structurally related proteins including
those encoded by class I and class II MHC genes on host cells will serve to present the peptide
antigen(s) to T lymphocytes. The antigen components could also be supplied as purified
MHC-peptide complexes alone or with co-stimulatory molecules that can directly signal T cells.
Alternatively antibodies able to bind surface immunoglobulin and other molecules on B cells as 
well as antibodies able to bind the TCR and other molecules on T cells can be combined with the 
pharmaceutical composition of the invention. 

The pharmaceutical composition of the invention may be in the form of a liposome in
which protein of the present invention is combined, in addition to other pharmaceutically
acceptable carriers, with amphipathic agents such as lipids which exist in aggregated form as
micelles, insoluble monolayers, liquid crystals, or lamellar layers in aqueous solution. Suitable
lipids for liposomal formulation include, without limitation, monoglycerides, diglycerides,
sulfatides, lysolecithins, phospholipids, saponin, bile acids, and the like. Preparation of such
liposomal formulations is within the level of skill in the art, as disclosed, for example, in U.S.
Patent Nos. 4,235,871; 4,501,728; 4,837,028; and 4,737,323, all of which are incorporated
herein by reference.

The amount of protein or other active ingredient of the present invention in the 
pharmaceutical composition of the present invention will depend upon the nature and severity of
the condition being treated, and on the nature of prior treatments which the patient has
undergone. Ultimately, the attending physician will decide the amount of protein or other active
ingredient of the present invention with which to treat each individual patient. Initially, the 
attending physician will administer low doses of protein or other active ingredient of the present 
invention and observe the patient's response. Larger doses of protein or other active ingredient
of the present invention may be administered until the optimal therapeutic effect is obtained for
the patient, and at that point the dosage is not increased further. It is contemplated that the
various pharmaceutical compositions used to practice the method of the present invention should
contain about 0.01 µg to about 100 mg (preferably about 0.1 µg to about 10 mg, more preferably
about 0.1 µg to about 1 mg) of protein or other active ingredient of the present invention per kg
body weight. For compositions of the present invention which are useful for bone, cartilage,
tendon or ligament regeneration, the therapeutic method includes administering the composition
topically, systemically, or locally as an implant or device. When administered, the therapeutic
composition for use in this invention is, of course, in a pyrogen-free, physiologically acceptable
form. Further, the composition may desirably be encapsulated or injected in a viscous form for
delivery to the site of bone, cartilage or tissue damage. Topical administration may be suitable
for wound healing and tissue repair. Therapeutically useful agents other than a protein or other 
active ingredient of the invention which may also optionally be included in the composition as 
described above, may alternatively or additionally, be administered simultaneously or 
sequentially with the composition in the methods of the invention. Preferably for bone and/or
cartilage formation, the composition would include a matrix capable of delivering the
protein-containing or other active ingredient-containing composition to the site of bone and/or
cartilage damage, providing a structure for the developing bone and cartilage and optimally
capable of being resorbed into the body. Such matrices may be formed of materials presently in
use for other implanted medical applications.

The choice of matrix material is based on biocompatibility, biodegradability, mechanical
properties, cosmetic appearance and interface properties. The particular application of the 
compositions will define the appropriate formulation. Potential matrices for the compositions 
may be biodegradable and chemically defined calcium sulfate, tricalcium phosphate, 
hydroxyapatite, polylactic acid, polyglycolic acid and polyanhydrides. Other potential materials 
are biodegradable and biologically well-defined, such as bone or dermal collagen. Further 
matrices are comprised of pure proteins or extracellular matrix components. Other potential 
matrices are nonbiodegradable and chemically defined, such as sintered hydroxyapatite, bioglass, 
aluminates, or other ceramics. Matrices may be comprised of combinations of any of the above 
mentioned types of material, such as polylactic acid and hydroxyapatite or collagen and 
tricalcium phosphate. The bioceramics may be altered in composition, such as in 
calcium-aluminate-phosphate and processing to alter pore size, particle size, particle shape, and 
biodegradability. Presently preferred is a 50:50 (mole weight) copolymer of lactic acid and
glycolic acid in the form of porous particles having diameters ranging from 150 to 800 microns. 
In some applications, it will be useful to utilize a sequestering agent, such as carboxymethyl 
cellulose or autologous blood clot, to prevent the protein compositions from disassociating from 
the matrix. 

A preferred family of sequestering agents is cellulosic materials such as alkylcelluloses 
(including hydroxy alkylcelluloses), including methylcellulose, ethylcellulose, 
hydroxyethylcellulose, hydroxypropylcellulose, hydroxypropyl-methylcellulose, and 
carboxymethylcellulose, the most preferred being cationic salts of carboxymethylcellulose 
(CMC). Other preferred sequestering agents include hyaluronic acid, sodium alginate, 
poly(ethylene glycol), polyoxyethylene oxide, carboxyvinyl polymer and poly(vinyl alcohol). 
The amount of sequestering agent useful herein is 0.5-20 wt %, preferably 1-10 wt % based on 
total formulation weight, which represents the amount necessary to prevent desorption of the 
protein from the polymer matrix and to provide appropriate handling of the composition, yet not 
so much that the progenitor cells are prevented from infiltrating the matrix, thereby providing the 
protein the opportunity to assist the osteogenic activity of the progenitor cells. In further 
compositions, proteins or other active ingredients of the invention may be combined with other 
agents beneficial to the treatment of the bone and/or cartilage defect, wound, or tissue in 
question. These agents include various growth factors such as epidermal growth factor (EGF), 
platelet-derived growth factor (PDGF), transforming growth factors (TGF-α and TGF-β), and
insulin-like growth factor (IGF). 

WO 01/57188 PCT/US01/03800

The therapeutic compositions are also presently valuable for veterinary applications.
Particularly domestic animals and thoroughbred horses, in addition to humans, are desired
patients for such treatment with proteins or other active ingredients of the present invention. The
dosage regimen of a protein-containing pharmaceutical composition to be used in tissue 
regeneration will be determined by the attending physician considering various factors which 
modify the action of the proteins, e.g., amount of tissue weight desired to be formed, the site of 
damage, the condition of the damaged tissue, the size of a wound, type of damaged tissue (e.g.,
bone), the patient's age, sex, and diet, the severity of any infection, time of administration and
other clinical factors. The dosage may vary with the type of matrix used in the reconstitution
and with inclusion of other proteins in the pharmaceutical composition. For example, the
addition of other known growth factors, such as IGF I (insulin like growth factor I), to the final
composition, may also affect the dosage. Progress can be monitored by periodic assessment of
tissue/bone growth and/or repair, for example, X-rays, histomorphometric determinations and 
tetracycline labeling. 

Polynucleotides of the present invention can also be used for gene therapy. Such 
polynucleotides can be introduced either in vivo or ex vivo into cells for expression in a
mammalian subject. Polynucleotides of the invention may also be administered by other known
methods for introduction of nucleic acid into a cell or organism (including, without limitation, in
the form of viral vectors or naked DNA). Cells may also be cultured ex vivo in the presence of
proteins of the present invention in order to proliferate or to produce a desired effect on or
activity in such cells. Treated cells can then be introduced in vivo for therapeutic purposes.

4.12.3 EFFECTIVE DOSAGE

Pharmaceutical compositions suitable for use in the present invention include 
compositions wherein the active ingredients are contained in an effective amount to achieve their
intended purpose. More specifically, a therapeutically effective amount means an amount
effective to prevent development of or to alleviate the existing symptoms of the subject being
treated. Determination of the effective amount is well within the capability of those skilled in
the art, especially in light of the detailed disclosure provided herein. For any compound used in
the method of the invention, the therapeutically effective dose can be estimated initially from
appropriate in vitro assays. For example, a dose can be formulated in animal models to achieve a
circulating concentration range that includes the IC50 as determined in cell culture (i.e., the
concentration of the test compound which achieves a half-maximal inhibition of the protein's
biological activity). Such information can be used to more accurately determine useful doses in
humans.

A therapeutically effective dose refers to that amount of the compound that results in
amelioration of symptoms or a prolongation of survival in a patient. Toxicity and therapeutic 
efficacy of such compounds can be determined by standard pharmaceutical procedures in cell 
cultures or experimental animals, e.g., for determining the LD50 (the dose lethal to 50% of the 
population) and the ED50 (the dose therapeutically effective in 50% of the population). The dose 
ratio between toxic and therapeutic effects is the therapeutic index and it can be expressed as the 
ratio between LD50 and ED50. Compounds which exhibit high therapeutic indices are preferred. 
The data obtained from these cell culture assays and animal studies can be used in formulating a 
range of dosage for use in humans. The dosage of such compounds lies preferably within a range
of circulating concentrations that include the ED50 with little or no toxicity. The dosage may 
vary within this range depending upon the dosage form employed and the route of administration 
utilized. The exact formulation, route of administration and dosage can be chosen by the 
individual physician in view of the patient's condition. See, e.g., Fingl et al., 1975, in "The
Pharmacological Basis of Therapeutics", Ch. 1 p. 1. Dosage amount and interval may be adjusted
individually to provide plasma levels of the active moiety which are sufficient to maintain the 
desired effects, or minimal effective concentration (MEC). The MEC will vary for each 
compound but can be estimated from in vitro data. Dosages necessary to achieve the MEC will 
depend on individual characteristics and route of administration. However, HPLC assays or 
bioassays can be used to determine plasma concentrations. 
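
For illustration, the therapeutic index defined above as the ratio between LD50 and ED50 may be computed as follows. This Python sketch is a minimal example under stated assumptions: the function name is hypothetical, both values are assumed to be expressed in the same units, and the numbers in the usage line are invented placeholders rather than data from this disclosure.

```python
def therapeutic_index(ld50, ed50):
    """Return LD50 / ED50, the dose ratio between toxic and therapeutic effects.

    ld50: dose lethal to 50% of the population.
    ed50: dose therapeutically effective in 50% of the population.
    Both must be positive and expressed in the same units (e.g., mg/kg).
    """
    if ld50 <= 0 or ed50 <= 0:
        raise ValueError("LD50 and ED50 must be positive")
    return ld50 / ed50


if __name__ == "__main__":
    # Placeholder values for illustration only.
    print(therapeutic_index(ld50=500.0, ed50=25.0))
```

A larger ratio corresponds to a wider margin between the effective and the lethal dose, which is why compounds with high therapeutic indices are preferred.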

Dosage intervals can also be determined using the MEC value. Compounds should be
administered using a regimen which maintains plasma levels above the MEC for 10-90% of the 
time, preferably between 30-90% and most preferably between 50-90%. In cases of local 
administration or selective uptake, the effective local concentration of the drug may not be 
related to plasma concentration. 
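
As a sketch of how the time-above-MEC criterion might be checked, the following Python example estimates the fraction of a dosing interval during which sampled plasma levels exceed the MEC and classifies it against the 10-90%, 30-90% and 50-90% ranges stated above. The function names are hypothetical, the samples are assumed to be equally spaced in time, and the concentrations are invented placeholders.

```python
def fraction_above_mec(concentrations, mec):
    """Fraction of equally spaced plasma samples that exceed the MEC.

    With equally spaced samples this approximates the fraction of the
    dosing interval spent above the minimal effective concentration.
    """
    if not concentrations:
        raise ValueError("at least one concentration sample is required")
    above = sum(1 for c in concentrations if c > mec)
    return above / len(concentrations)


def regimen_band(fraction):
    """Classify a time-above-MEC fraction against the ranges in the text."""
    if 0.5 <= fraction <= 0.9:
        return "most preferred (50-90%)"
    if 0.3 <= fraction <= 0.9:
        return "preferred (30-90%)"
    if 0.1 <= fraction <= 0.9:
        return "within 10-90%"
    return "outside 10-90%"


if __name__ == "__main__":
    # Placeholder plasma levels (arbitrary units) over one dosing interval.
    samples = [0.5, 2.0, 3.0, 2.5, 1.5, 0.8, 0.4, 0.2]
    f = fraction_above_mec(samples, mec=1.0)
    print(f, regimen_band(f))
```

In practice the plasma concentrations would come from HPLC assays or bioassays, and unevenly spaced samples would call for interpolation rather than simple counting.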

An exemplary dosage regimen for polypeptides or other compositions of the invention 
will be in the range of about 0.01 µg/kg to 100 mg/kg of body weight daily, with the preferred
dose being about 0.1 µg/kg to 25 mg/kg of patient body weight daily, varying in adults and
children. Dosing may be once daily, or equivalent doses may be delivered at longer or shorter 
intervals. 
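
For orientation, the exemplary per-kilogram range can be converted to an absolute daily amount for a given body weight. The sketch below uses the outer range stated above (0.01 µg/kg to 100 mg/kg); the function name and the 70 kg example are hypothetical, and the result is arithmetic only, not a dosing recommendation.

```python
UG_PER_MG = 1000.0  # micrograms per milligram


def daily_dose_range_mg(body_weight_kg: float,
                        low_ug_per_kg: float = 0.01,
                        high_mg_per_kg: float = 100.0) -> tuple:
    """Return (low, high) daily amounts in mg for the given body weight."""
    if body_weight_kg <= 0:
        raise ValueError("body weight must be positive")
    low_mg = low_ug_per_kg * body_weight_kg / UG_PER_MG
    high_mg = high_mg_per_kg * body_weight_kg
    return low_mg, high_mg


# Hypothetical 70 kg adult.
low, high = daily_dose_range_mg(70.0)
print(f"{low:.4f} mg to {high:.0f} mg per day")
```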

The amount of composition administered will, of course, be dependent on the subject 
being treated, on the subject's age and weight, the severity of the affliction, the manner of 
administration and the judgment of the prescribing physician. 

4.12.4 PACKAGING 






The compositions may, if desired, be presented in a pack or dispenser device which may 
contain one or more unit dosage forms containing the active ingredient. The pack may, for 
example, comprise metal or plastic foil, such as a blister pack. The pack or dispenser device may 
be accompanied by instructions for administration. Compositions comprising a compound of the 
invention formulated in a compatible pharmaceutical carrier may also be prepared, placed in an
appropriate container, and labeled for treatment of an indicated condition. 

4.13 ANTIBODIES 

Also included in the invention are antibodies to proteins, or fragments of proteins, of the
invention. The term "antibody" as used herein refers to immunoglobulin molecules and
immunologically active portions of immunoglobulin (Ig) molecules, i.e., molecules that contain
an antigen binding site that specifically binds (immunoreacts with) an antigen. Such antibodies
include, but are not limited to, polyclonal, monoclonal, chimeric, single chain, Fab, Fab' and
F(ab')2 fragments, and an Fab expression library. In general, an antibody molecule obtained from
humans relates to any of the classes IgG, IgM, IgA, IgE and IgD, which differ from one another
by the nature of the heavy chain present in the molecule. Certain classes have subclasses as well,
such as IgG1, IgG2, and others. Furthermore, in humans, the light chain may be a kappa chain or
a lambda chain. Reference herein to antibodies includes a reference to all such classes,
subclasses and types of human antibody species.

An isolated related protein of the invention may be intended to serve as an antigen, or a
portion or fragment thereof, and additionally can be used as an immunogen to generate
antibodies that immunospecifically bind the antigen, using standard techniques for polyclonal
and monoclonal antibody preparation. The full-length protein can be used or, alternatively, the
invention provides antigenic peptide fragments of the antigen for use as immunogens. An
antigenic peptide fragment comprises at least 6 amino acid residues of the amino acid sequence
of the full length protein (for example, the amino acid sequence shown in SEQ ID NO: 1 351),
and encompasses an epitope thereof such that an antibody raised against the peptide forms a
specific immune complex with the full length protein or with any fragment that contains the
epitope. Preferably, the antigenic peptide comprises at least 10 amino acid residues, or at least
15 amino acid residues, or at least 20 amino acid residues, or at least 30 amino acid residues.
Preferred epitopes encompassed by the antigenic peptide are regions of the protein that are
located on its surface; commonly these are hydrophilic regions.

In certain embodiments of the invention, at least one epitope encompassed by the
antigenic peptide is a region of a related protein that is located on the surface of the protein,
e.g., a hydrophilic region. A hydrophobicity analysis of the human related protein sequence will
indicate which regions of a related protein are particularly hydrophilic and, therefore, are likely
to encode surface residues useful for targeting antibody production. As a means for targeting
antibody production, hydropathy plots showing regions of hydrophilicity and hydrophobicity
may be generated by any method well known in the art, including, for example, the
Kyte-Doolittle or the Hopp-Woods methods, either with or without Fourier transformation. See,
e.g., Hopp and Woods, 1981, Proc. Natl. Acad. Sci. USA 78: 3824-3828; Kyte and Doolittle 1982,
J. Mol. Biol. 157: 105-142, each of which is incorporated herein by reference in its entirety.
Antibodies that are specific for one or more domains within an antigenic protein, or derivatives,
fragments, analogs or homologs thereof, are also provided herein.
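
A hydropathy profile of the kind referred to above can be sketched with a sliding-window average over the Kyte-Doolittle scale. The scale values are the published ones; the window length, the function names and the toy sequence are assumptions made for illustration, and no Fourier transformation is applied.

```python
# Kyte-Doolittle hydropathy values (positive = hydrophobic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence: str, window: int = 7) -> list:
    """Mean Kyte-Doolittle score of every window along the sequence."""
    seq = sequence.upper()
    if window < 1 or window > len(seq):
        raise ValueError("window must be between 1 and the sequence length")
    scores = [KYTE_DOOLITTLE[res] for res in seq]  # KeyError on nonstandard residues
    return [sum(scores[i:i + window]) / window
            for i in range(len(seq) - window + 1)]


def most_hydrophilic_window(sequence: str, window: int = 7) -> tuple:
    """Return (start index, peptide, score) of the lowest-scoring window."""
    profile = hydropathy_profile(sequence, window)
    start = min(range(len(profile)), key=profile.__getitem__)
    return start, sequence[start:start + window], profile[start]


# Toy sequence (hypothetical): a hydrophobic stretch followed by a lysine-rich stretch.
print(most_hydrophilic_window("IIIIKKKKK", window=3))
```

Windows with the most negative averages are the candidate surface-exposed regions; in practice the window length would be tuned, and the Hopp-Woods scale could be substituted by swapping the dictionary.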

A protein of the invention, or a derivative, fragment, analog, homolog or ortholog
thereof, may be utilized as an immunogen in the generation of antibodies that
immunospecifically bind these protein components.

Various procedures known within the art may be used for the production of polyclonal or
monoclonal antibodies directed against a protein of the invention, or against derivatives,
fragments, analogs, homologs or orthologs thereof (see, for example, Antibodies: A Laboratory
Manual, Harlow E, and Lane D, 1988, Cold Spring Harbor Laboratory Press, Cold Spring
Harbor, NY, incorporated herein by reference). Some of these antibodies are discussed below.

5.13.1 Polyclonal Antibodies

For the production of polyclonal antibodies, various suitable host animals (e.g., rabbit,
goat, mouse or other mammal) may be immunized by one or more injections with the native
protein, a synthetic variant thereof, or a derivative of the foregoing. An appropriate
immunogenic preparation can contain, for example, the naturally occurring immunogenic
protein, a chemically synthesized polypeptide representing the immunogenic protein, or a
recombinantly expressed immunogenic protein. Furthermore, the protein may be conjugated to
a second protein known to be immunogenic in the mammal being immunized. Examples of such
immunogenic proteins include but are not limited to keyhole limpet hemocyanin, serum albumin,
bovine thyroglobulin, and soybean trypsin inhibitor. The preparation can further include an
adjuvant. Various adjuvants used to increase the immunological response include, but are not
limited to, Freund's (complete and incomplete), mineral gels (e.g., aluminum hydroxide), surface
active substances (e.g., lysolecithin, pluronic polyols, polyanions, peptides, oil emulsions,
dinitrophenol, etc.), adjuvants usable in humans such as Bacille Calmette-Guerin and
Corynebacterium parvum, or similar immunostimulatory agents. Additional examples of
adjuvants which can be employed include MPL-TDM adjuvant (monophosphoryl Lipid A,
synthetic trehalose dicorynomycolate).






The polyclonal antibody molecules directed against the immunogenic protein can be 
isolated from the mammal (e.g. , from the blood) and further purified by well known techniques, 
such as affinity chromatography using protein A or protein G, which provide primarily the IgG 
fraction of immune serum. Subsequently, or alternatively, the specific antigen which is the 
target of the immunoglobulin sought, or an epitope thereof, may be immobilized on a column to
purify the immune specific antibody by immunoaffinity chromatography. Purification of 
immunoglobulins is discussed, for example, by D. Wilkinson (The Scientist, published by The 
Scientist, Inc., Philadelphia PA, Vol. 14, No. 8 (April 17, 2000), pp. 25-28). 

5.13.2 Monoclonal Antibodies

The term "monoclonal antibody" (MAb) or "monoclonal antibody composition", as used 
herein, refers to a population of antibody molecules that contain only one molecular species of 
antibody molecule consisting of a unique light chain gene product and a unique heavy chain 
gene product. In particular, the complementarity determining regions (CDRs) of the monoclonal
antibody are identical in all the molecules of the population. MAbs thus contain an antigen
binding site capable of immunoreacting with a particular epitope of the antigen characterized by
a unique binding affinity for it. 

Monoclonal antibodies can be prepared using hybridoma methods, such as those 
described by Kohler and Milstein, Nature, 256:495 (1975). In a hybridoma method, a mouse,
hamster, or other appropriate host animal, is typically immunized with an immunizing agent to
elicit lymphocytes that produce or are capable of producing antibodies that will specifically bind 
to the immunizing agent. Alternatively, the lymphocytes can be immunized in vitro. 

The immunizing agent will typically include the protein antigen, a fragment thereof or a
fusion protein thereof. Generally, either peripheral blood lymphocytes are used if cells of human
origin are desired, or spleen cells or lymph node cells are used if non-human mammalian sources
are desired. The lymphocytes are then fused with an immortalized cell line using a suitable
fusing agent, such as polyethylene glycol, to form a hybridoma cell (Goding, Monoclonal
Antibodies: Principles and Practice, Academic Press, (1986) pp. 59-103). Immortalized cell
lines are usually transformed mammalian cells, particularly myeloma cells of rodent, bovine and
human origin. Usually, rat or mouse myeloma cell lines are employed. The hybridoma cells can
be cultured in a suitable culture medium that preferably contains one or more substances that
inhibit the growth or survival of the unfused, immortalized cells. For example, if the parental
cells lack the enzyme hypoxanthine guanine phosphoribosyl transferase (HGPRT or HPRT), the
culture medium for the hybridomas typically will include hypoxanthine, aminopterin, and
thymidine ("HAT medium"), which substances prevent the growth of HGPRT-deficient cells.

Preferred immortalized cell lines are those that fuse efficiently, support stable high level
expression of antibody by the selected antibody-producing cells, and are sensitive to a medium 
such as HAT medium. More preferred immortalized cell lines are murine myeloma lines, which 
can be obtained, for instance, from the Salk Institute Cell Distribution Center, San Diego, 
California and the American Type Culture Collection, Manassas, Virginia. Human myeloma and 
mouse-human heteromyeloma cell lines also have been described for the production of human 
monoclonal antibodies (Kozbor, J. Immunol. 133:3001 (1984); Brodeur et al., Monoclonal
Antibody Production Techniques and Applications, Marcel Dekker, Inc., New York, (1987) pp. 
51-63). 

The culture medium in which the hybridoma cells are cultured can then be assayed for 
the presence of monoclonal antibodies directed against the antigen. Preferably, the binding 
specificity of monoclonal antibodies produced by the hybridoma cells is determined by 
immunoprecipitation or by an in vitro binding assay, such as radioimmunoassay (RIA) or
enzyme-linked immunoabsorbent assay (ELISA). Such techniques and assays are known in the 
art. The binding affinity of the monoclonal antibody can, for example, be determined by the 
Scatchard analysis of Munson and Pollard, Anal. Biochem., 107:220 (1980). Preferably,
antibodies having a high degree of specificity and a high binding affinity for the target antigen 
are isolated. 
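
The classical linearized form of a Scatchard analysis can be sketched as follows. For a single class of binding sites, bound/free plotted against bound gives a line of slope -1/Kd with x-intercept Bmax. This is a plain least-squares illustration and not the computerized procedure of Munson and Pollard; the function name and the data are hypothetical and simulated.

```python
def scatchard_fit(bound: list, free: list) -> tuple:
    """Estimate (Kd, Bmax) from paired bound and free ligand concentrations.

    Fits bound/free = (Bmax - bound) / Kd by ordinary least squares.
    """
    if len(bound) != len(free) or len(bound) < 2:
        raise ValueError("need at least two paired measurements")
    ratios = [b / f for b, f in zip(bound, free)]
    n = len(bound)
    mean_x = sum(bound) / n
    mean_y = sum(ratios) / n
    sxx = sum((x - mean_x) ** 2 for x in bound)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(bound, ratios))
    slope = sxy / sxx                      # equals -1 / Kd
    intercept = mean_y - slope * mean_x    # equals Bmax / Kd
    kd = -1.0 / slope
    return kd, intercept * kd


# Simulated, noise-free data for a hypothetical antibody: Kd = 2 nM, Bmax = 10 nM.
free_nm = [0.5, 1.0, 2.0, 4.0, 8.0]
bound_nm = [10.0 * f / (2.0 + f) for f in free_nm]
print(scatchard_fit(bound_nm, free_nm))
```

A smaller fitted Kd corresponds to a higher binding affinity. With real, noisy data a nonlinear fit of the binding isotherm is generally more reliable than this linearization.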

After the desired hybridoma cells are identified, the clones can be subcloned by limiting 
dilution procedures and grown by standard methods. Suitable culture media for this purpose 
include, for example, Dulbecco's Modified Eagle's Medium and RPMI-1640 medium. 
Alternatively, the hybridoma cells can be grown in vivo as ascites in a mammal. 

The monoclonal antibodies secreted by the subclones can be isolated or purified from the 
culture medium or ascites fluid by conventional immunoglobulin purification procedures such 
as, for example, protein A-Sepharose, hydroxylapatite chromatography, gel electrophoresis, 
dialysis, or affinity chromatography. 

The monoclonal antibodies can also be made by recombinant DNA methods, such as 
those described in U.S. Patent No. 4,816,567. DNA encoding the monoclonal antibodies of the 
invention can be readily isolated and sequenced using conventional procedures (e.g., by using 
oligonucleotide probes that are capable of binding specifically to genes encoding the heavy and 
light chains of murine antibodies). The hybridoma cells of the invention serve as a preferred 
source of such DNA. Once isolated, the DNA can be placed into expression vectors, which are 
then transfected into host cells such as simian COS cells, Chinese hamster ovary (CHO) cells, or 
myeloma cells that do not otherwise produce immunoglobulin protein, to obtain the synthesis of 
monoclonal antibodies in the recombinant host cells. The DNA also can be modified, for 
example, by substituting the coding sequence for human heavy and light chain constant domains
in place of the homologous murine sequences (U.S. Patent No. 4,816,567; Morrison, Nature 368,
812-13 (1994)) or by covalently joining to the immunoglobulin coding sequence all or part of the
coding sequence for a non-immunoglobulin polypeptide. Such a non-immunoglobulin
polypeptide can be substituted for the constant domains of an antibody of the invention, or can
be substituted for the variable domains of one antigen-combining site of an antibody of the
invention to create a chimeric bivalent antibody.

5.13.2 Humanized Antibodies 

The antibodies directed against the protein antigens of the invention can further comprise
humanized antibodies or human antibodies. These antibodies are suitable for administration to
humans without engendering an immune response by the human against the administered
immunoglobulin. Humanized forms of antibodies are chimeric immunoglobulins,
immunoglobulin chains or fragments thereof (such as Fv, Fab, Fab', F(ab')2 or other
antigen-binding subsequences of antibodies) that are principally comprised of the sequence of a
human immunoglobulin, and contain minimal sequence derived from a non-human immunoglobulin.
Humanization can be performed following the method of Winter and co-workers (Jones et al.,
Nature, 321:522-525 (1986); Riechmann et al., Nature, 332:323-327 (1988); Verhoeyen et al.,
Science, 239:1534-1536 (1988)), by substituting rodent CDRs or CDR sequences for the
corresponding sequences of a human antibody. (See also U.S. Patent No. 5,225,539.) In some
instances, Fv framework residues of the human immunoglobulin are replaced by corresponding
non-human residues. Humanized antibodies can also comprise residues which are found neither
in the recipient antibody nor in the imported CDR or framework sequences. In general, the
humanized antibody will comprise substantially all of at least one, and typically two, variable
domains, in which all or substantially all of the CDR regions correspond to those of a non-human
immunoglobulin and all or substantially all of the framework regions are those of a human
immunoglobulin consensus sequence. The humanized antibody optimally also will comprise at
least a portion of an immunoglobulin constant region (Fc), typically that of a human
immunoglobulin (Jones et al., 1986; Riechmann et al., 1988; and Presta, Curr. Op. Struct. Biol.,
2:593-596 (1992)).

5.13.3 Human Antibodies 

Fully human antibodies relate to antibody molecules in which essentially the entire
sequences of both the light chain and the heavy chain, including the CDRs, arise from human
genes. Such antibodies are termed "human antibodies", or "fully human antibodies" herein.
Human monoclonal antibodies can be prepared by the trioma technique; the human B-cell
hybridoma technique (see Kozbor, et al., 1983 Immunol Today 4: 72) and the EBV hybridoma
technique to produce human monoclonal antibodies (see Cole, et al., 1985 In: Monoclonal
Antibodies and Cancer Therapy, Alan R. Liss, Inc., pp. 77-96). Human monoclonal
antibodies may be utilized in the practice of the present invention and may be produced by using
human hybridomas (see Cote, et al., 1983. Proc Natl Acad Sci USA 80: 2026-2030) or by
transforming human B-cells with Epstein Barr Virus in vitro (see Cole, et al., 1985 In:
Monoclonal Antibodies and Cancer Therapy, Alan R. Liss, Inc., pp. 77-96).

In addition, human antibodies can also be produced using additional techniques,
including phage display libraries (Hoogenboom and Winter, J. Mol. Biol., 227:381 (1991);
Marks et al., J. Mol. Biol., 222:581 (1991)). Similarly, human antibodies can be made by
introducing human immunoglobulin loci into transgenic animals, e.g., mice in which the
endogenous immunoglobulin genes have been partially or completely inactivated. Upon
challenge, human antibody production is observed, which closely resembles that seen in humans
in all respects, including gene rearrangement, assembly, and antibody repertoire. This approach
is described, for example, in U.S. Patent Nos. 5,545,807; 5,545,806; 5,569,825; 5,625,126;
5,633,425; 5,661,016, and in Marks et al. (Bio/Technology 10, 779-783 (1992)); Lonberg et al.
(Nature 368, 856-859 (1994)); Morrison (Nature 368, 812-13 (1994)); Fishwild et al. (Nature
Biotechnology 14, 845-51 (1996)); Neuberger (Nature Biotechnology 14, 826 (1996)); and
Lonberg and Huszar (Intern. Rev. Immunol. 13, 65-93 (1995)).

Human antibodies may additionally be produced using transgenic nonhuman animals
which are modified so as to produce fully human antibodies rather than the animal's endogenous
antibodies in response to challenge by an antigen. (See PCT publication WO94/02602). The
endogenous genes encoding the heavy and light immunoglobulin chains in the nonhuman host
have been incapacitated, and active loci encoding human heavy and light chain immunoglobulins
are inserted into the host's genome. The human genes are incorporated, for example, using yeast
artificial chromosomes containing the requisite human DNA segments. An animal which
provides all the desired modifications is then obtained as progeny by crossbreeding intermediate
transgenic animals containing fewer than the full complement of the modifications. The
preferred embodiment of such a nonhuman animal is a mouse, and is termed the Xenomouse™
as disclosed in PCT publications WO 96/33735 and WO 96/34096. This animal produces B
cells which secrete fully human immunoglobulins. The antibodies can be obtained directly from
the animal after immunization with an immunogen of interest, as, for example, a preparation of a
polyclonal antibody, or alternatively from immortalized B cells derived from the animal, such as
hybridomas producing monoclonal antibodies. Additionally, the genes encoding the
immunoglobulins with human variable regions can be recovered and expressed to obtain the
antibodies directly, or can be further modified to obtain analogs of antibodies such as, for
example, single chain Fv molecules.

An example of a method of producing a nonhuman host, exemplified as a mouse, lacking
expression of an endogenous immunoglobulin heavy chain is disclosed in U.S. Patent No.
5,939,598. It can be obtained by a method including deleting the J segment genes from at least
one endogenous heavy chain locus in an embryonic stem cell to prevent rearrangement of the
locus and to prevent formation of a transcript of a rearranged immunoglobulin heavy chain locus,
the deletion being effected by a targeting vector containing a gene encoding a selectable marker;
and producing from the embryonic stem cell a transgenic mouse whose somatic and germ cells
contain the gene encoding the selectable marker.

A method for producing an antibody of interest, such as a human antibody, is disclosed in 
U.S. Patent No. 5,916,771. It includes introducing an expression vector that contains a 
nucleotide sequence encoding a heavy chain into one mammalian host cell in culture, introducing
an expression vector containing a nucleotide sequence encoding a light chain into another
mammalian host cell, and fusing the two cells to form a hybrid cell. The hybrid cell expresses an
antibody containing the heavy chain and the light chain. 

In a further improvement on this procedure, a method for identifying a clinically relevant 
epitope on an immunogen, and a correlative method for selecting an antibody that binds
immunospecifically to the relevant epitope with high affinity, are disclosed in PCT publication
WO 99/53049. 

5.13.4 Fab Fragments and Single Chain Antibodies

According to the invention, techniques can be adapted for the production of single-chain
antibodies specific to an antigenic protein of the invention (see e.g., U.S. Patent No. 4,946,778).
In addition, methods can be adapted for the construction of Fab expression libraries (see e.g.,
Huse, et al., 1989 Science 246: 1275-1281) to allow rapid and effective identification of
monoclonal Fab fragments with the desired specificity for a protein or derivatives, fragments,
analogs or homologs thereof. Antibody fragments that contain the idiotypes to a protein antigen
may be produced by techniques known in the art including, but not limited to: (i) an F(ab')2
fragment produced by pepsin digestion of an antibody molecule; (ii) an Fab fragment generated
by reducing the disulfide bridges of an F(ab')2 fragment; (iii) an Fab fragment generated by the
treatment of the antibody molecule with papain and a reducing agent; and (iv) Fv fragments.

5.13.5 Bispecific Antibodies

Bispecific antibodies are monoclonal, preferably human or humanized, antibodies that
have binding specificities for at least two different antigens. In the present case, one of the
binding specificities is for an antigenic protein of the invention. The second binding target is any
other antigen, and advantageously is a cell-surface protein or receptor or receptor subunit.

Methods for making bispecific antibodies are known in the art. Traditionally, the
recombinant production of bispecific antibodies is based on the co-expression of two
immunoglobulin heavy-chain/light-chain pairs, where the two heavy chains have different
specificities (Milstein and Cuello, Nature, 305:537-539 (1983)). Because of the random
assortment of immunoglobulin heavy and light chains, these hybridomas (quadromas) produce a
potential mixture of ten different antibody molecules, of which only one has the correct
bispecific structure. The purification of the correct molecule is usually accomplished by affinity
chromatography steps. Similar procedures are disclosed in WO 93/08829, published 13 May
1993, and in Traunecker et al., 1991 EMBO J., 10:3655-3659.
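
The count of ten molecular species can be reproduced by simple enumeration. The sketch below assumes each antibody consists of two heavy-chain/light-chain arms and that any heavy chain may pair with any light chain; the chain labels (H1, H2, L1, L2) and function names are hypothetical.

```python
from itertools import combinations_with_replacement, product


def quadroma_species(heavy=("H1", "H2"), light=("L1", "L2")) -> list:
    """List every distinct two-arm antibody a quadroma could assemble."""
    # Four possible arms: each heavy chain paired with each light chain.
    arms = [h + l for h, l in product(heavy, light)]
    # An antibody is an unordered pair of arms (the two arms may be identical).
    return list(combinations_with_replacement(arms, 2))


def is_desired_bispecific(species: tuple) -> bool:
    """True only when each heavy chain carries its cognate light chain."""
    return set(species) == {"H1L1", "H2L2"}


all_species = quadroma_species()
print(len(all_species))
print([s for s in all_species if is_desired_bispecific(s)])
```

Four possible arms taken two at a time with repetition give ten combinations, of which exactly one pairs each heavy chain with its cognate light chain.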

Antibody variable domains with the desired binding specificities (antibody-antigen
combining sites) can be fused to immunoglobulin constant domain sequences. The fusion
preferably is with an immunoglobulin heavy-chain constant domain, comprising at least part of
the hinge, CH2, and CH3 regions. It is preferred to have the first heavy-chain constant region
(CH1) containing the site necessary for light-chain binding present in at least one of the fusions.
DNAs encoding the immunoglobulin heavy-chain fusions and, if desired, the immunoglobulin
light chain, are inserted into separate expression vectors, and are co-transfected into a suitable
host organism. For further details of generating bispecific antibodies see, for example, Suresh et
al., Methods in Enzymology, 121:210 (1986).

According to another approach described in WO 96/27011, the interface between a pair
of antibody molecules can be engineered to maximize the percentage of heterodimers which are
recovered from recombinant cell culture. The preferred interface comprises at least a part of the
CH3 region of an antibody constant domain. In this method, one or more small amino acid side
chains from the interface of the first antibody molecule are replaced with larger side chains (e.g.
tyrosine or tryptophan). Compensatory "cavities" of identical or similar size to the large side
chain(s) are created on the interface of the second antibody molecule by replacing large amino
acid side chains with smaller ones (e.g. alanine or threonine). This provides a mechanism for
increasing the yield of the heterodimer over other unwanted end-products such as homodimers.

Bispecific antibodies can be prepared as full length antibodies or antibody fragments (e.g.
F(ab')2 bispecific antibodies). Techniques for generating bispecific antibodies from antibody
fragments have been described in the literature. For example, bispecific antibodies can be
prepared using chemical linkage. Brennan et al., Science 229:81 (1985) describe a procedure
wherein intact antibodies are proteolytically cleaved to generate F(ab')2 fragments. These
fragments are reduced in the presence of the dithiol complexing agent sodium arsenite to
stabilize vicinal dithiols and prevent intermolecular disulfide formation. The Fab' fragments
generated are then converted to thionitrobenzoate (TNB) derivatives. One of the Fab'-TNB
derivatives is then reconverted to the Fab'-thiol by reduction with mercaptoethylamine and is
mixed with an equimolar amount of the other Fab'-TNB derivative to form the bispecific
antibody. The bispecific antibodies produced can be used as agents for the selective
immobilization of enzymes.

Additionally, Fab' fragments can be directly recovered from E. coli and chemically 
coupled to form bispecific antibodies. Shalaby et al., J. Exp. Med. 175:217-225 (1992) describe 
the production of a fully humanized bispecific antibody F(ab')2 molecule. Each Fab' fragment
was separately secreted from E. coli and subjected to directed chemical coupling in vitro to form 
the bispecific antibody. The bispecific antibody thus formed was able to bind to cells 
overexpressing the ErbB2 receptor and normal human T cells, as well as trigger the lytic activity 
of human cytotoxic lymphocytes against human breast tumor targets. 

Various techniques for making and isolating bispecific antibody fragments directly from 
recombinant cell culture have also been described. For example, bispecific antibodies have been 
produced using leucine zippers. Kostelny et al., J. Immunol. 148(5):1547-1553 (1992). The
leucine zipper peptides from the Fos and Jun proteins were linked to the Fab' portions of two 
different antibodies by gene fusion. The antibody homodimers were reduced at the hinge region 
to form monomers and then re-oxidized to form the antibody heterodimers. This method can 
also be utilized for the production of antibody homodimers. The "diabody" technology 
described by Hollinger et al., Proc. Natl. Acad. Sci. USA 90:6444-6448 (1993) has provided an
alternative mechanism for making bispecific antibody fragments. The fragments comprise a
heavy-chain variable domain (VH) connected to a light-chain variable domain (VL) by a linker
which is too short to allow pairing between the two domains on the same chain. Accordingly,
the VH and VL domains of one fragment are forced to pair with the complementary VL and VH
domains of another fragment, thereby forming two antigen-binding sites. Another strategy for 
making bispecific antibody fragments by the use of single-chain Fv (sFv) dimers has also been 
reported. See, Gruber et al., J. Immunol. 152:5368 (1994). 

Antibodies with more than two valencies are contemplated. For example, trispecific 
antibodies can be prepared. Tutt et al., J. Immunol. 147:60 (1991).

Exemplary bispecific antibodies can bind to two different epitopes, at least one of which
originates in the protein antigen of the invention. Alternatively, an anti-antigenic arm of an
immunoglobulin molecule can be combined with an arm which binds to a triggering molecule on
a leukocyte such as a T-cell receptor molecule (e.g. CD2, CD3, CD28, or B7), or Fc receptors for
IgG (FcγR), such as FcγRI (CD64), FcγRII (CD32) and FcγRIII (CD16) so as to focus
cellular defense mechanisms to the cell expressing the particular antigen. Bispecific antibodies
can also be used to direct cytotoxic agents to cells which express a particular antigen. These
antibodies possess an antigen-binding arm and an arm which binds a cytotoxic agent or a
radionuclide chelator, such as EOTUBE, DPTA, DOTA, or TETA. Another bispecific antibody
of interest binds the protein antigen described herein and further binds tissue factor (TF).

5.13.6 Heteroconjugate Antibodies 

Heteroconjugate antibodies are also within the scope of the present invention. 
Heteroconjugate antibodies are composed of two covalently joined antibodies. Such antibodies 
have, for example, been proposed to target immune system cells to unwanted cells (U.S. Patent 
No. 4,676,980), and for treatment of HIV infection (WO 91/00360; WO 92/200373; EP 03089). 
It is contemplated that the antibodies can be prepared in vitro using known methods in synthetic 
protein chemistry, including those involving crosslinking agents. For example, immunotoxins 
can be constructed using a disulfide exchange reaction or by forming a thioether bond.
Examples of suitable reagents for this purpose include iminothiolate and methyl-4- 
mercaptobutyrimidate and those disclosed, for example, in U.S. Patent No. 4,676,980. 

5.13.7 Effector Function Engineering 

It can be desirable to modify the antibody of the invention with respect to effector function, so as 
to enhance, e.g., the effectiveness of the antibody in treating cancer. For example, cysteine 
residue(s) can be introduced into the Fc region, thereby allowing interchain disulfide bond 
formation in this region. The homodimeric antibody thus generated can have improved 
internalization capability and/or increased complement-mediated cell killing and
antibody-dependent cellular cytotoxicity (ADCC). See Caron et al., J. Exp. Med., 176:1191-1195 (1992)
and Shopes, J. Immunol., 148: 2918-2922 (1992). Homodimeric antibodies with enhanced anti- 
tumor activity can also be prepared using heterobifunctional cross-linkers as described in Wolff 
et al. Cancer Research, 53: 2560-2565 (1993). Alternatively, an antibody can be engineered that 
has dual Fc regions and can thereby have enhanced complement lysis and ADCC capabilities. 
See Stevenson et al., Anti-Cancer Drug Design, 3: 219-230 (1989). 

5.13.8 Immunoconjugates 

The invention also pertains to immunoconjugates comprising an antibody conjugated to a 
cytotoxic agent such as a chemotherapeutic agent, toxin (e.g. , an enzymatically active toxin of 
bacterial, fungal, plant, or animal origin, or fragments thereof), or a radioactive isotope (i.e., a
radioconjugate). 

Chemotherapeutic agents useful in the generation of such immunoconjugates have been
described above. Enzymatically active toxins and fragments thereof that can be used include
diphtheria A chain, nonbinding active fragments of diphtheria toxin, exotoxin A chain (from
Pseudomonas aeruginosa), ricin A chain, abrin A chain, modeccin A chain, alpha-sarcin,
Aleurites fordii proteins, dianthin proteins, Phytolacca americana proteins (PAPI, PAPII, and
PAP-S), Momordica charantia inhibitor, curcin, crotin, Saponaria officinalis inhibitor, gelonin,
mitogellin, restrictocin, phenomycin, enomycin, and the trichothecenes. A variety of
radionuclides are available for the production of radioconjugated antibodies. Examples include
212Bi, 131I, 131In, 90Y, and 186Re.

Conjugates of the antibody and cytotoxic agent are made using a variety of bifunctional
protein-coupling agents such as N-succinimidyl-3-(2-pyridyldithiol) propionate (SPDP),
iminothiolane (IT), bifunctional derivatives of imidoesters (such as dimethyl adipimidate HCl),
active esters (such as disuccinimidyl suberate), aldehydes (such as glutaraldehyde), bis-azido
compounds (such as bis (p-azidobenzoyl) hexanediamine), bis-diazonium derivatives (such as
bis-(p-diazoniumbenzoyl)-ethylenediamine), diisocyanates (such as tolylene 2,6-diisocyanate),
and bis-active fluorine compounds (such as 1,5-difluoro-2,4-dinitrobenzene). For example, a
ricin immunotoxin can be prepared as described in Vitetta et al., Science, 238: 1098 (1987).
Carbon-14-labeled 1-isothiocyanatobenzyl-3-methyldiethylene triaminepentaacetic acid
(MX-DTPA) is an exemplary chelating agent for conjugation of radionucleotide to the antibody.
See WO94/11026.

In another embodiment, the antibody can be conjugated to a "receptor" (such as
streptavidin) for utilization in tumor pretargeting wherein the antibody-receptor conjugate is
administered to the patient, followed by removal of unbound conjugate from the circulation
using a clearing agent and then administration of a "ligand" (e.g., avidin) that is in turn
conjugated to a cytotoxic agent.

4.14 COMPUTER READABLE SEQUENCES 

In one application of this embodiment, a nucleotide sequence of the present invention can
be recorded on computer readable media. As used herein, "computer readable media" refers to
any medium which can be read and accessed directly by a computer. Such media include, but
are not limited to: magnetic storage media, such as floppy discs, hard disc storage medium, and
magnetic tape; optical storage media such as CD-ROM; electrical storage media such as RAM
and ROM; and hybrids of these categories such as magnetic/optical storage media. A skilled
artisan can readily appreciate how any of the presently known computer readable media can
be used to create a manufacture comprising a computer readable medium having recorded thereon
a nucleotide sequence of the present invention. As used herein, "recorded" refers to a process for
storing information on a computer readable medium. A skilled artisan can readily adopt any of the
presently known methods for recording information on a computer readable medium to generate
manufactures comprising the nucleotide sequence information of the present invention.

A variety of data storage structures are available to a skilled artisan for creating a 
computer readable medium having recorded thereon a nucleotide sequence of the present 
invention. The choice of the data storage structure will generally be based on the means chosen
to access the stored information. In addition, a variety of data processor programs and formats
can be used to store the nucleotide sequence information of the present invention on computer
readable medium. The sequence information can be represented in a word processing text file,
formatted in commercially-available software such as WordPerfect and Microsoft Word, or
represented in the form of an ASCII file, stored in a database application, such as DB2, Sybase,
Oracle, or the like. A skilled artisan can readily adapt any number of data processor structuring
formats (e.g., text file or database) in order to obtain computer readable medium having recorded
thereon the nucleotide sequence information of the present invention.
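
As one illustration of recording sequence information in an ASCII file, the following minimal Python sketch formats sequences as plain text. The FASTA-style layout, the function name to_fasta, and the identifier "SEQ_ID_NO_1" are assumptions chosen for illustration; the specification does not prescribe any particular file format.

```python
def to_fasta(records, width=60):
    """Format (identifier, sequence) pairs as FASTA-style ASCII text.

    Each record becomes a '>' header line followed by the sequence,
    wrapped at `width` characters per line.
    """
    lines = []
    for ident, seq in records:
        seq = seq.upper()
        lines.append(">" + ident)
        # Slice the sequence into fixed-width chunks.
        lines.extend(seq[i:i + width] for i in range(0, len(seq), width))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical record; a real file would hold the recorded sequences.
    print(to_fasta([("SEQ_ID_NO_1", "acgt" * 20)]), end="")
```

The resulting text could equally be written to disk or loaded into a database column; the choice depends on how the stored information will be accessed.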

By providing any of the nucleotide sequences SEQ ID NO: 1-1350 or a representative
fragment thereof, or a nucleotide sequence at least 95% identical to any of the nucleotide
sequences of SEQ ID NO: 1-1350, in computer readable form, a skilled artisan can routinely
access the sequence information for a variety of purposes. Computer software is publicly
available which allows a skilled artisan to access sequence information provided in a computer
readable medium. The examples which follow demonstrate how software which implements the
BLAST (Altschul et al., J. Mol. Biol. 215:403-410 (1990)) and BLAZE (Brutlag et al., Comp.
Chem. 17:203-207 (1993)) search algorithms on a Sybase system is used to identify open reading
frames (ORFs) within a nucleic acid sequence. Such ORFs may be protein encoding fragments
and may be useful in producing commercially important proteins such as enzymes used in
fermentation reactions and in the production of commercially useful metabolites.
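
To make the notion of identifying ORFs in a recorded sequence concrete, the following Python sketch scans the three forward reading frames for stretches that begin with ATG and end at a stop codon. It is a simplified illustration only, not the BLAST or BLAZE software referred to above; the function name find_orfs, the min_codons parameter, and the restriction to the forward strand are assumptions.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_codons=3):
    """Return (start, end, frame) for each ATG...stop ORF on the forward strand.

    start is the 0-based index of the A in ATG; end is exclusive and
    includes the stop codon. min_codons counts the codons from ATG up to,
    but not including, the stop codon.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # open a candidate ORF at the first ATG
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None  # resume scanning for the next ATG
    return orfs


if __name__ == "__main__":
    print(find_orfs("CCATGAAATTTGGGTAACC"))
```

A fuller treatment would also scan the reverse complement and handle ORFs that run off the end of a fragment.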

As used herein, "a computer-based system" refers to the hardware means, software
means, and data storage means used to analyze the nucleotide sequence information of the
present invention. The minimum hardware means of the computer-based systems of the present
invention comprises a central processing unit (CPU), input means, output means, and data
storage means. A skilled artisan can readily appreciate that any one of the currently available
computer-based systems is suitable for use in the present invention. As stated above, the
computer-based systems of the present invention comprise a data storage means having stored
therein a nucleotide sequence of the present invention and the necessary hardware means and
software means for supporting and implementing a search means. As used herein, "data storage
means" refers to memory which can store nucleotide sequence information of the present
invention, or a memory access means which can access manufactures having recorded thereon
the nucleotide sequence information of the present invention.

As used herein, "search means" refers to one or more programs which are implemented
on the computer-based system to compare a target sequence or target structural motif with the
sequence information stored within the data storage means. Search means are used to identify
fragments or regions of a known sequence which match a particular target sequence or target
motif. A variety of known algorithms are disclosed publicly and a variety of commercially
available software for conducting search means are available and can be used in the computer-based
systems of the present invention. Examples of such software include, but are not limited to,
Smith-Waterman, MacPattern (EMBL), BLASTN and BLASTA (NPOLYPEPTIDEIA). A
skilled artisan can readily recognize that any one of the available algorithms or implementing
software packages for conducting homology searches can be adapted for use in the present
computer-based systems. As used herein, a "target sequence" can be any nucleic acid or amino
acid sequence of six or more nucleotides or two or more amino acids. A skilled artisan can
readily recognize that the longer a target sequence is, the less likely a target sequence will be
present as a random occurrence in the database. The most preferred sequence length of a target
sequence is from about 10 to 300 amino acids, more preferably from about 30 to 100 nucleotide
residues. However, it is well recognized that searches for commercially important fragments,
such as sequence fragments involved in gene expression and protein processing, may be of
shorter length.
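
The relationship between target length and chance occurrence can be illustrated numerically. Under the simplifying assumption of independent, equally likely residues (which real sequences only approximate), a target of length n drawn from an alphabet of size a is expected to occur by chance about (N - n + 1) * a^(-n) times in a database of N residues. The Python sketch below evaluates this estimate; the function name and parameters are illustrative assumptions.

```python
def expected_random_hits(target_len, db_len, alphabet_size=4):
    """Expected number of chance exact matches of one target in a random database.

    Assumes independent, uniformly distributed residues; alphabet_size is
    4 for nucleotides and 20 for amino acids.
    """
    positions = max(db_len - target_len + 1, 0)  # possible alignment starts
    return positions * alphabet_size ** (-target_len)


if __name__ == "__main__":
    for n in (6, 10, 30):
        print(n, expected_random_hits(n, 10**9))
```

Under this model each additional nucleotide in the target reduces the expected number of chance matches roughly fourfold.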

As used herein, "a target structural motif," or "target motif," refers to any rationally
selected sequence or combination of sequences in which the sequence(s) are chosen based on a
three-dimensional configuration which is formed upon the folding of the target motif. There are
a variety of target motifs known in the art. Protein target motifs include, but are not limited to,
enzyme active sites and signal sequences. Nucleic acid target motifs include, but are not limited
to, promoter sequences, hairpin structures and inducible expression elements (protein binding
sequences).
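
As a rough illustration of searching for one kind of nucleic acid target motif, the Python sketch below looks for candidate hairpins, i.e., a short stem followed, after a loop, by the stem's reverse complement. The function names, the default stem and loop lengths, and the requirement of a perfect stem match are assumptions made for illustration; practical motif searches typically also score mismatches and folding energy.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def find_hairpins(seq, stem=4, min_loop=3, max_loop=8):
    """Return (stem_start, loop_len) for each perfect inverted repeat.

    A hit means seq[stem_start:stem_start+stem] is followed, after
    loop_len bases, by its exact reverse complement.
    """
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - 2 * stem - min_loop + 1):
        right_arm = reverse_complement(seq[i:i + stem])
        for loop in range(min_loop, max_loop + 1):
            j = i + stem + loop  # start of the candidate right arm
            if j + stem > len(seq):
                break
            if seq[j:j + stem] == right_arm:
                hits.append((i, loop))
    return hits


if __name__ == "__main__":
    print(find_hairpins("GGGCAAAAGCCC"))
```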



4.15 TRIPLE HELIX FORMATION 

In addition, the fragments of the present invention, as broadly described, can be used to
control gene expression through triple helix formation or antisense DNA or RNA, both of which
methods are based on the binding of a polynucleotide sequence to DNA or RNA.
Polynucleotides suitable for use in these methods are preferably 20 to 40 bases in length and are
designed to be complementary to a region of the gene involved in transcription (triple helix - see
Lee et al., Nucl. Acids Res. 6:3073 (1979); Cooney et al., Science 241:456 (1988); and Dervan
et al., Science 251:1360 (1991)) or to the mRNA itself (antisense - Okano, J. Neurochem.
56:560 (1991); Oligodeoxynucleotides as Antisense Inhibitors of Gene Expression, CRC Press,
Boca Raton, FL (1988)). Triple helix-formation optimally results in a shut-off of RNA
transcription from DNA, while antisense RNA hybridization blocks translation of an mRNA
molecule into polypeptide. Both techniques have been demonstrated to be effective in model
systems. Information contained in the sequences of the present invention is necessary for the
design of an antisense or triple helix oligonucleotide.
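
The way sequence information feeds into antisense design can be sketched as follows: a window of the target mRNA is selected and its reverse complement is written out as the candidate oligonucleotide. This Python sketch is illustrative only; the function name antisense_oligo and its parameters are assumptions, the length check simply mirrors the 20 to 40 base preference stated above, and no attempt is made to assess target accessibility or specificity.

```python
# U is mapped to A so that mRNA input may be written with either T or U.
_COMPLEMENT = str.maketrans("ACGTU", "TGCAA")


def antisense_oligo(mrna, start, length=20):
    """Return the DNA oligo complementary to mrna[start:start+length], 5'->3'."""
    if not 20 <= length <= 40:
        raise ValueError("length outside the preferred 20 to 40 base range")
    target = mrna.upper()[start:start + length]
    if len(target) < length:
        raise ValueError("window runs past the end of the sequence")
    # Complement each base, then reverse to give the 5'->3' orientation.
    return target.translate(_COMPLEMENT)[::-1]


if __name__ == "__main__":
    # Hypothetical 20-base target beginning at a start codon.
    print(antisense_oligo("AUG" + "C" * 17, 0))
```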

4.16 DIAGNOSTIC ASSAYS AND KITS 

The present invention further provides methods to identify the presence or expression of
one of the ORFs of the present invention, or homolog thereof, in a test sample, using a nucleic
acid probe or antibodies of the present invention, optionally conjugated or otherwise associated
with a suitable label.

In general, methods for detecting a polynucleotide of the invention can comprise
contacting a sample with a compound that binds to and forms a complex with the polynucleotide
for a period sufficient to form the complex, and detecting the complex, so that if a complex is
detected, a polynucleotide of the invention is detected in the sample. Such methods can also
comprise contacting a sample under stringent hybridization conditions with nucleic acid primers
that anneal to a polynucleotide of the invention under such conditions, and amplifying annealed
polynucleotides, so that if a polynucleotide is amplified, a polynucleotide of the invention is
detected in the sample.

In general, methods for detecting a polypeptide of the invention can comprise contacting
a sample with a compound that binds to and forms a complex with the polypeptide for a period
sufficient to form the complex, and detecting the complex, so that if a complex is detected, a
polypeptide of the invention is detected in the sample.

In detail, such methods comprise incubating a test sample with one or more of the
antibodies or one or more of the nucleic acid probes of the present invention and assaying for
binding of the nucleic acid probes or antibodies to components within the test sample.

Conditions for incubating a nucleic acid probe or antibody with a test sample vary.
Incubation conditions depend on the format employed in the assay, the detection methods
employed, and the type and nature of the nucleic acid probe or antibody used in the assay. One
skilled in the art will recognize that any one of the commonly available hybridization,
amplification or immunological assay formats can readily be adapted to employ the nucleic acid
probes or antibodies of the present invention. Examples of such assays can be found in Chard,
T., An Introduction to Radioimmunoassay and Related Techniques, Elsevier Science Publishers,
Amsterdam, The Netherlands (1986); Bullock, G.R. et al., Techniques in Immunocytochemistry,
Academic Press, Orlando, FL Vol. 1 (1982), Vol. 2 (1983), Vol. 3 (1985); Tijssen, P., Practice
and Theory of Immunoassays: Laboratory Techniques in Biochemistry and Molecular Biology,
Elsevier Science Publishers, Amsterdam, The Netherlands (1985). The test samples of the
present invention include cells, protein or membrane extracts of cells, or biological fluids such as
sputum, blood, serum, plasma, or urine. The test sample used in the above-described method
will vary based on the assay format, nature of the detection method and the tissues, cells or
extracts used as the sample to be assayed. Methods for preparing protein extracts or membrane
extracts of cells are well known in the art and can readily be adapted in order to obtain a
sample which is compatible with the system utilized.

In another embodiment of the present invention, kits are provided which contain the
necessary reagents to carry out the assays of the present invention. Specifically, the invention
provides a compartment kit to receive, in close confinement, one or more containers which
comprise: (a) a first container comprising one of the probes or antibodies of the present
invention; and (b) one or more other containers comprising one or more of the following: wash
reagents, reagents capable of detecting presence of a bound probe or antibody.

In detail, a compartment kit includes any kit in which reagents are contained in separate
containers. Such containers include small glass containers, plastic containers or strips of plastic
or paper. Such containers allow one to efficiently transfer reagents from one compartment to
another compartment such that the samples and reagents are not cross-contaminated, and the
agents or solutions of each container can be added in a quantitative fashion from one
compartment to another. Such containers will include a container which will accept the test
sample, a container which contains the antibodies used in the assay, containers which contain
wash reagents (such as phosphate buffered saline, Tris-buffers, etc.), and containers which
contain the reagents used to detect the bound antibody or probe. Types of detection reagents
include labeled nucleic acid probes, labeled secondary antibodies, or in the alternative, if the
primary antibody is labeled, the enzymatic, or antibody binding reagents which are capable of
reacting with the labeled antibody. One skilled in the art will readily recognize that the disclosed
probes and antibodies of the present invention can be readily incorporated into one of the
established kit formats which are well known in the art.



4.17 MEDICAL IMAGING

The novel polypeptides and binding partners of the invention are useful in medical
imaging of sites expressing the molecules of the invention (e.g., where the polypeptide of the
invention is involved in the immune response, for imaging sites of inflammation or infection).
See, e.g., Kunkel et al., U.S. Pat. No. 5,413,778. Such methods involve chemical attachment of
a labeling or imaging agent, administration of the labeled polypeptide to a subject in a
pharmaceutically acceptable carrier, and imaging the labeled polypeptide in vivo at the target
site.



4.18 SCREENING ASSAYS 
Using the isolated proteins and polynucleotides of the invention, the present invention
further provides methods of obtaining and identifying agents which bind to a polypeptide
encoded by an ORF corresponding to any of the nucleotide sequences set forth in SEQ ID NO: 1-1350,
or bind to a specific domain of the polypeptide encoded by the nucleic acid. In detail, said
method comprises the steps of:

(a) contacting an agent with an isolated protein encoded by an ORF of the present
invention, or nucleic acid of the invention; and

(b) determining whether the agent binds to said protein or said nucleic acid.

In general, therefore, such methods for identifying compounds that bind to a
polynucleotide of the invention can comprise contacting a compound with a polynucleotide of
the invention for a time sufficient to form a polynucleotide/compound complex, and detecting
the complex, so that if a polynucleotide/compound complex is detected, a compound that binds
to a polynucleotide of the invention is identified.

Likewise, in general, such methods for identifying compounds that bind to a
polypeptide of the invention can comprise contacting a compound with a polypeptide of the
invention for a time sufficient to form a polypeptide/compound complex, and detecting the
complex, so that if a polypeptide/compound complex is detected, a compound that binds to a
polypeptide of the invention is identified.

Methods for identifying compounds that bind to a polypeptide of the invention can also
comprise contacting a compound with a polypeptide of the invention in a cell for a time
sufficient to form a polypeptide/compound complex, wherein the complex drives expression of a
reporter gene sequence in the cell, and detecting the complex by detecting reporter gene
sequence expression, so that if a polypeptide/compound complex is detected, a compound that
binds a polypeptide of the invention is identified.

Compounds identified via such methods can include compounds which modulate the
activity of a polypeptide of the invention (that is, increase or decrease its activity, relative to
activity observed in the absence of the compound). Alternatively, compounds identified via such
methods can include compounds which modulate the expression of a polynucleotide of the
invention (that is, increase or decrease expression relative to expression levels observed in the
absence of the compound). Compounds, such as compounds identified via the methods of the
invention, can be tested using standard assays well known to those of skill in the art for their
ability to modulate activity/expression.

The agents screened in the above assay can be, but are not limited to, peptides, 
carbohydrates, vitamin derivatives, or other pharmaceutical agents. The agents can be selected 
and screened at random or rationally selected or designed using protein modeling techniques. 

For random screening, agents such as peptides, carbohydrates, pharmaceutical agents and 
the like are selected at random and are assayed for their ability to bind to the protein encoded by 
the ORF of the present invention. Alternatively, agents may be rationally selected or designed. 
As used herein, an agent is said to be "rationally selected or designed" when the agent is chosen 
based on the configuration of the particular protein. For example, one skilled in the art can 
readily adapt currently available procedures to generate peptides, pharmaceutical agents and the 
like, capable of binding to a specific peptide sequence, in order to generate rationally designed 
antipeptide peptides (for example, see Hruby et al., "Application of Synthetic Peptides: Antisense
Peptides," in Synthetic Peptides, A User's Guide, W.H. Freeman, NY (1992), pp. 289-307, and
Kasprzak et al., Biochemistry 28:9230-8 (1989)), or pharmaceutical agents, or the like.

In addition to the foregoing, one class of agents of the present invention, as broadly 
described, can be used to control gene expression through binding to one of the ORFs or EMFs 
of the present invention. As described above, such agents can be randomly screened or 
rationally designed/selected. Targeting the ORF or EMF allows a skilled artisan to design 
sequence specific or element specific agents, modulating the expression of either a single ORF or 
multiple ORFs which rely on the same EMF for expression control. One class of DNA binding
agents are agents which contain base residues which hybridize or form a triple helix
by binding to DNA or RNA. Such agents can be based on the classic phosphodiester,
ribonucleic acid backbone, or can be a variety of sulfhydryl or polymeric derivatives which have 
base attachment capacity. 

Agents suitable for use in these methods preferably contain 20 to 40 bases and are 
designed to be complementary to a region of the gene involved in transcription (triple helix - see 
Lee et al., Nucl. Acids Res. 6:3073 (1979); Cooney et al., Science 241:456 (1988); and Dervan et
al., Science 251:1360 (1991)) or to the mRNA itself (antisense - Okano, J. Neurochem. 56:560 
(1991); Oligodeoxynucleotides as Antisense Inhibitors of Gene Expression, CRC Press, Boca 
Raton, FL (1988)). Triple helix-formation optimally results in a shut-off of RNA transcription
from DNA, while antisense RNA hybridization blocks translation of an mRNA molecule into
polypeptide. Both techniques have been demonstrated to be effective in model systems.
Information contained in the sequences of the present invention is necessary for the design of an
antisense or triple helix oligonucleotide and other DNA binding agents.

Agents which bind to a protein encoded by one of the ORFs of the present invention can
be used as a diagnostic agent. Agents which bind to a protein encoded by one of the ORFs of the
present invention can be formulated using known techniques to generate a pharmaceutical
composition.



4.19 USE OF NUCLEIC ACIDS AS PROBES

Another aspect of the subject invention is to provide for polypeptide-specific nucleic acid
hybridization probes capable of hybridizing with naturally occurring nucleotide sequences. The
hybridization probes of the subject invention may be derived from any of the nucleotide
sequences SEQ ID NO: 1-1350. Because the corresponding gene is only expressed in a limited
number of tissues, a hybridization probe derived from any of the nucleotide sequences SEQ
ID NO: 1-1350 can be used as an indicator of the presence of RNA of a cell type of such a tissue in
a sample.

Any suitable hybridization technique can be employed, such as, for example, in situ
hybridization. PCR as described in US Patents Nos. 4,683,195 and 4,965,188 provides
additional uses for oligonucleotides based upon the nucleotide sequences. Such probes used in
PCR may be of recombinant origin, may be chemically synthesized, or a mixture of both. The
probe will comprise a discrete nucleotide sequence for the detection of identical sequences or a
degenerate pool of possible sequences for identification of closely related genomic sequences.

Other means for producing specific hybridization probes for nucleic acids include the
cloning of nucleic acid sequences into vectors for the production of mRNA probes. Such vectors
are known in the art and are commercially available and may be used to synthesize RNA probes
in vitro by means of the addition of the appropriate RNA polymerase, such as T7 or SP6 RNA
polymerase, and the appropriate radioactively labeled nucleotides. The nucleotide sequences may
be used to construct hybridization probes for mapping their respective genomic sequences. The
nucleotide sequence provided herein may be mapped to a chromosome or specific regions of a
chromosome using well known genetic and/or chromosomal mapping techniques. These
techniques include in situ hybridization, linkage analysis against known chromosomal markers,
hybridization screening with libraries or flow-sorted chromosomal preparations specific to
known chromosomes, and the like. The technique of fluorescent in situ hybridization of
chromosome spreads has been described, among other places, in Verma et al. (1988) Human
Chromosomes: A Manual of Basic Techniques, Pergamon Press, New York NY.

Fluorescent in situ hybridization of chromosomal preparations and other physical
chromosome mapping techniques may be correlated with additional genetic map data. Examples
of genetic map data can be found in the 1994 Genome Issue of Science (265:1981f). Correlation
between the location of a nucleic acid on a physical chromosomal map and a specific disease (or
predisposition to a specific disease) may help delimit the region of DNA associated with that
genetic disease. The nucleotide sequences of the subject invention may be used to detect
differences in gene sequences between normal, carrier or affected individuals.

4.20 PREPARATION OF SUPPORT BOUND OLIGONUCLEOTIDES

Oligonucleotides, i.e., small nucleic acid segments, may be readily prepared by, for
example, directly synthesizing the oligonucleotide by chemical means, as is commonly practiced 
using an automated oligonucleotide synthesizer. 

Support bound oligonucleotides may be prepared by any of the methods known to those of
skill in the art using any suitable support such as glass, polystyrene or Teflon. One strategy is to
precisely spot oligonucleotides synthesized by standard synthesizers. Immobilization can be
achieved using passive adsorption (Inouye & Hondo, (1990) J. Clin. Microbiol. 28(6) 1469-72);
using UV light (Nagata et al., 1985; Dahlen et al., 1987; Morrissey & Collins, (1989) Mol. Cell.
Probes 3(2) 189-207) or by covalent binding of base modified DNA (Keller et al., 1988; 1989); all
references being specifically incorporated herein.

Another strategy that may be employed is the use of the strong biotin-streptavidin
interaction as a linker. For example, Broude et al. (1994) Proc. Natl. Acad. Sci. USA 91(8) 3072-6,
describe the use of biotinylated probes, although these are duplex probes, that are immobilized on
streptavidin-coated magnetic beads. Streptavidin-coated beads may be purchased from Dynal,
Oslo. Of course, this same linking chemistry is applicable to coating any surface with streptavidin.
Biotinylated probes may be purchased from various sources, such as, e.g., Operon Technologies
(Alameda, CA).

Nunc Laboratories (Naperville, IL) is also selling suitable material that could be used. Nunc
Laboratories have developed a method by which DNA can be covalently bound to the microwell
surface termed CovaLink NH. CovaLink NH is a polystyrene surface grafted with secondary amino
groups (>NH) that serve as bridge-heads for further covalent coupling. CovaLink Modules may be
purchased from Nunc Laboratories. DNA molecules may be bound to CovaLink exclusively at the
5'-end by a phosphoramidate bond, allowing immobilization of more than 1 pmol of DNA
(Rasmussen et al., (1991) Anal. Biochem. 198(1) 138-42).

The use of CovaLink NH strips for covalent binding of DNA molecules at the 5'-end has
been described (Rasmussen et al., (1991)). In this technology, a phosphoramidate bond is employed
(Chu et al., (1983) Nucleic Acids Res. 11(8) 6513-29). This is beneficial as immobilization using
only a single covalent bond is preferred. The phosphoramidate bond joins the DNA to the
CovaLink NH secondary amino groups that are positioned at the end of spacer arms covalently
grafted onto the polystyrene surface through a 2 nm long spacer arm. To link an oligonucleotide to
CovaLink NH via a phosphoramidate bond, the oligonucleotide terminus must have a 5'-end
phosphate group. It is, perhaps, even possible for biotin to be covalently bound to CovaLink and
then streptavidin used to bind the probes.

More specifically, the linkage method includes dissolving DNA in water (7.5 ng/ul) and
denaturing for 10 min. at 95°C and cooling on ice for 10 min. Ice-cold 0.1 M 1-methylimidazole,
pH 7.0 (1-MeIm7), is then added to a final concentration of 10 mM 1-MeIm7. A ss DNA solution is
then dispensed into CovaLink NH strips (75 ul/well) standing on ice.

Carbodiimide 0.2 M 1-ethyl-3-(3-dimethylaminopropyl)-carbodiimide (EDC), dissolved in
10 mM 1-MeIm7, is made fresh and 25 ul added per well. The strips are incubated for 5 hours at
50°C. After incubation the strips are washed using, e.g., Nunc-Immuno Wash; first the wells are
washed 3 times, then they are soaked with washing solution for 5 min., and finally they are washed
3 times (wherein the washing solution is 0.4 N NaOH, 0.25% SDS heated to 50°C).

It is contemplated that a further suitable method for use with the present invention is that
described in PCT Patent Application WO 90/03382 (Southern & Maskos), incorporated herein by
reference. This method of preparing an oligonucleotide bound to a support involves attaching a
nucleoside 3'-reagent through the phosphate group by a covalent phosphodiester link to aliphatic
hydroxyl groups carried by the support. The oligonucleotide is then synthesized on the supported
nucleoside and protecting groups removed from the synthetic oligonucleotide chain under standard
conditions that do not cleave the oligonucleotide from the support. Suitable reagents include
nucleoside phosphoramidite and nucleoside hydrogen phosphorate.

An on-chip strategy for the preparation of DNA probe arrays may be employed. For example,
addressable laser-activated photodeprotection may be
employed in the chemical synthesis of oligonucleotides directly on a glass surface, as described by
Fodor et al. (1991) Science 251(4995) 767-73, incorporated herein by reference. Probes may also
be immobilized on nylon supports as described by Van Ness et al. (1991) Nucleic Acids Res.
19(12) 3345-50; or linked to Teflon using the method of Duncan & Cavalier (1988) Anal. Biochem.
169(1) 104-8; all references being specifically incorporated herein.

To link an oligonucleotide to a nylon support, as described by Van Ness et al. (1991),
requires activation of the nylon surface via alkylation and selective activation of the 5'-amine of
oligonucleotides with cyanuric chloride.

One particular way to prepare support bound oligonucleotides is to utilize the
light-generated synthesis described by Pease et al. (1994) PNAS USA 91(11) 5022-6, incorporated
herein by reference. These authors used current photolithographic techniques to generate arrays of
immobilized oligonucleotide probes (DNA chips). These methods, in which light is used to direct
the synthesis of oligonucleotide probes in high-density, miniaturized arrays, utilize photolabile
5'-protected N-acyl-deoxynucleoside phosphoramidites, surface linker chemistry and versatile
combinatorial synthesis strategies. A matrix of 256 spatially defined oligonucleotide probes may be
generated in this manner.
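
The figure of 256 equals 4^4, the number of distinct sequences obtainable by varying four nucleotide positions. The Python sketch below enumerates every 4-mer and assigns each to a cell of a 16 x 16 grid. The lexicographic ordering, the square layout, and the function name kmer_array_layout are purely illustrative assumptions and are not taken from the cited reference.

```python
from itertools import product


def kmer_array_layout(k=4, alphabet="ACGT"):
    """Map every k-mer to a (row, col) cell of a square grid.

    k-mers are listed in lexicographic order and placed row by row.
    """
    kmers = ["".join(p) for p in product(alphabet, repeat=k)]
    side = int(round(len(kmers) ** 0.5))
    if side * side != len(kmers):
        raise ValueError("number of k-mers is not a perfect square")
    return {kmer: divmod(i, side) for i, kmer in enumerate(kmers)}


if __name__ == "__main__":
    layout = kmer_array_layout()
    print(len(layout), layout["ACGT"])
```

Because each cell has a known sequence, a hybridization signal at a given position identifies the complementary sequence present in the sample.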

4.21 PREPARATION OF NUCLEIC ACID FRAGMENTS

The nucleic acids may be obtained from any appropriate source, such as cDNAs, genomic
DNA, chromosomal DNA, microdissected chromosome bands, cosmid or YAC inserts, and RNA,
including mRNA without any amplification steps. For example, Sambrook et al. (1989) describes
three protocols for the isolation of high molecular weight DNA from mammalian cells (p.
9.14-9.23).

DNA fragments may be prepared as clones in M13, plasmid or lambda vectors and/or
prepared directly from genomic DNA or cDNA by PCR or other amplification methods. Samples
may be prepared or dispensed in multiwell plates. About 100-1000 ng of DNA samples may be
prepared in 2-500 ml of final volume.

The nucleic acids would then be fragmented by any of the methods known to those of skill 
in the art including, for example, using restriction enzymes as described at 9.24-9.28 of Sambrook et 
al (1989), shearing by ultrasound and NaOH treatment. 

Low pressure shearing is also appropriate, as described by Schriefer et al. (1990) Nucleic
Acids Res. 18(24) 7455-6 (incorporated herein by reference). In this method, DNA samples are
passed through a small French pressure cell at a variety of low to intermediate pressures. A lever
device allows controlled application of low to intermediate pressures to the cell. The results of
these studies indicate that low-pressure shearing is a useful alternative to sonic and enzymatic DNA
fragmentation methods.

One particularly suitable way for fragmenting DNA is contemplated to be that using the two
base recognition endonuclease, CviJI, described by Fitzgerald et al. (1992) Nucleic Acids Res.
20(14) 3753-62. These authors described an approach for the rapid fragmentation and fractionation
of DNA into particular sizes that they contemplated to be suitable for shotgun cloning and
sequencing.

The restriction endonuclease CviJI normally cleaves the recognition sequence PuGCPy
between the G and C to leave blunt ends. Atypical reaction conditions, which alter the specificity of
this enzyme (CviJI**), yield a quasi-random distribution of DNA fragments from the small
molecule pUC19 (2688 base pairs). Fitzgerald et al. (1992) quantitatively evaluated the
randomness of this fragmentation strategy, using a CviJI** digest of pUC19 that was size
fractionated by a rapid gel filtration method and directly ligated, without end repair, to a lac Z minus
M13 cloning vector. Sequence analysis of 76 clones showed that CviJI** restricts PyGCPy and
PuGCPu, in addition to PuGCPy sites, and that new sequence data is accumulated at a rate
consistent with random fragmentation.
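
By way of illustration only, the recognition specificities described above may be sketched in Python. The function names (cut_positions, fragments) and the example sequence are hypothetical and are not taken from Fitzgerald et al.; the sketch assumes a single-strand, in silico view in which a cut between the G and C of a matching four-base site stands in for a blunt-end cleavage.

```python
import re

# Lookahead patterns so that overlapping sites are all reported.
# Normal specificity: PuGCPy (Pu = A or G, Py = C or T).
NORMAL = re.compile(r"(?=[AG]GC[CT])")
# Relaxed (CviJI**) specificity as described above: PuGCPy, PyGCPy and PuGCPu.
RELAXED = re.compile(r"(?=[AG]GC[CT]|[CT]GC[CT]|[AG]GC[AG])")


def cut_positions(seq, relaxed=False):
    """Return 0-based indices at which the sequence is cut (between G and C)."""
    pattern = RELAXED if relaxed else NORMAL
    # m.start() is the first base of the site; the cut lies two bases further in.
    return [m.start() + 2 for m in pattern.finditer(seq.upper())]


def fragments(seq, relaxed=False):
    """Split seq at every cut position and return the resulting pieces."""
    bounds = [0] + cut_positions(seq, relaxed) + [len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:])]


if __name__ == "__main__":
    example = "AAGGCTTTTCGCTAAAGCA"  # hypothetical test sequence
    print(fragments(example))                # normal specificity
    print(fragments(example, relaxed=True))  # relaxed specificity
```

Under these assumptions the relaxed pattern cuts the example at three sites rather than one, which is the qualitative effect (more frequent, more nearly random cleavage) that the passage attributes to CviJI**.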

As reported in the literature, advantages of this approach compared to sonication and
agarose gel fractionation include: smaller amounts of DNA are required (0.2-0.5 µg instead of 2-5
µg); and fewer steps are involved (no preligation, end repair, chemical extraction, or agarose gel
electrophoresis and elution are needed).

Irrespective of the manner in which the nucleic acid fragments are obtained or prepared, it is 
important to denature the DNA to give single stranded pieces available for hybridization. This is 
achieved by incubating the DNA solution for 2-5 minutes at 80-90°C. The solution is then cooled 
quickly to 2°C to prevent renaturation of the DNA fragments before they are contacted with the
chip. Phosphate groups must also be removed from genomic DNA by methods known in the art.

4.22 PREPARATION OF DNA ARRAYS 

Arrays may be prepared by spotting DNA samples on a support such as a nylon membrane.
Spotting may be performed by using arrays of metal pins (the positions of which correspond to an
array of wells in a microtiter plate) to repeatedly transfer about 20 nl of a DNA solution to a
nylon membrane. By offset printing, a density of dots higher than the density of the wells is
achieved. One to 25 dots may be accommodated in 1 mm², depending on the type of label used. By
avoiding spotting in some preselected number of rows and columns, separate subsets (subarrays)
may be formed. Samples in one subarray may be the same genomic segment of DNA (or the same
gene) from different individuals, or may be different, overlapped genomic clones. Each of the
subarrays may represent replica spotting of the same samples. In one example, a selected gene
segment may be amplified from 64 patients. For each patient, the amplified gene segment may be in
one 96-well plate (all 96 wells containing the same sample). A plate for each of the 64 patients is
prepared. By using a 96-pin device, all samples may be spotted on one 8 x 12 cm membrane.
Subarrays may contain 64 samples, one from each patient. Where the 96 subarrays are identical, the
dot span may be 1 mm and there may be a 1 mm space between subarrays.
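
The 64-patient example above may be illustrated by a short Python sketch of the offset-printing geometry. Several values are assumptions not stated in the text: a pin pitch of 9 mm (the usual 96-well plate spacing), an 8 x 8 arrangement of the 64 patient dots within each subarray, and the function name dot_position, which is hypothetical.

```python
WELL_PITCH_MM = 9.0   # assumption: centre-to-centre spacing of the 96 pins
DOT_PITCH_MM = 1.0    # dot span of 1 mm, as in the example above
PLATE_ROWS, PLATE_COLS = 8, 12   # 96-pin device
SUBARRAY_SIDE = 8     # assumption: 64 patients laid out as 8 x 8 dots


def dot_position(patient, pin_row, pin_col):
    """Return (x_mm, y_mm) of the dot for one patient under one pin.

    Each pin prints one subarray; successive patient plates are printed
    with a 1 mm offset so that the 64 dots fill an 8 x 8 block.
    """
    if not 0 <= patient < SUBARRAY_SIDE * SUBARRAY_SIDE:
        raise ValueError("patient index must be in 0..63")
    if not (0 <= pin_row < PLATE_ROWS and 0 <= pin_col < PLATE_COLS):
        raise ValueError("pin position outside the 8 x 12 device")
    x = pin_col * WELL_PITCH_MM + (patient % SUBARRAY_SIDE) * DOT_PITCH_MM
    y = pin_row * WELL_PITCH_MM + (patient // SUBARRAY_SIDE) * DOT_PITCH_MM
    return (x, y)


if __name__ == "__main__":
    print(dot_position(0, 0, 0))    # first patient, first pin
    print(dot_position(63, 7, 11))  # last patient, last pin
```

With these assumed values the 8 dots of a subarray occupy 8 mm of each 9 mm pin period, leaving the 1 mm space between subarrays, and the full pattern of 96 x 64 dots stays within the 8 x 12 cm membrane.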

Another approach is to use membranes or plates (available from NUNC, Naperville, Illinois)
which may be partitioned by physical spacers, e.g., a plastic grid molded over the membrane, the grid
being similar to the sort of membrane applied to the bottom of multiwell plates, or hydrophobic
strips. A fixed physical spacer is not preferred for imaging by exposure to flat phosphor-storage
screens or x-ray films.

The present invention is illustrated in the following examples. Upon consideration of the
present disclosure, one of skill in the art will appreciate that many other embodiments and variations
may be made in the scope of the present invention. Accordingly, it is intended that the broader
aspects of the present invention not be limited to the disclosure of the following examples. The
present invention is not to be limited in scope by the exemplified embodiments which are intended
as illustrations of single aspects of the invention, and compositions and methods which are
functionally equivalent are within the scope of the invention. Indeed, numerous modifications and
variations in the practice of the invention are expected to occur to those skilled in the art upon
consideration of the present preferred embodiments. Consequently, the only limitations which
should be placed upon the scope of the invention are those which appear in the appended claims.

All references cited within the body of the instant specification are hereby incorporated by 
reference in their entirety.

5.0 EXAMPLES

5.1 EXAMPLE 1 

Novel Nucleic Acid Sequences Obtained From Various Libraries 

A plurality of novel nucleic acids were obtained from cDNA libraries prepared from various
human tissues and in some cases isolated from a genomic library derived from human chromosome
using standard PCR, SBH sequence signature analysis and Sanger sequencing techniques. The
inserts of the library were amplified with PCR using primers specific for the vector sequences
which flank the inserts. Clones from cDNA libraries were spotted on nylon membrane filters and
screened with oligonucleotide probes (e.g., 7-mers) to obtain signature sequences. The clones were
clustered into groups of similar or identical sequences. Representative clones were selected for
sequencing.

In some cases, the 5' sequence of the amplified inserts was then deduced using a typical
Sanger sequencing protocol. PCR products were purified and subjected to fluorescent dye
terminator cycle sequencing. Single pass gel sequencing was done using a 377 Applied Biosystems
(ABI) sequencer to obtain the novel nucleic acid sequences. In some cases RACE (Rapid
Amplification of cDNA Ends) was performed to further extend the sequence in the 5' direction.

5.2 EXAMPLE 2 

Novel Contigs

The novel contigs of the invention were assembled from sequences that were obtained from
a cDNA library by methods described in Example 1 above, and in some cases sequences obtained
from one or more public databases. The sequences for the resulting nucleic acid contigs are
designated as SEQ ID NO: 1-1350 and are provided in the attached Sequence Listing. The contigs
were assembled using an EST sequence as a seed. Then a recursive algorithm was used to extend
the seed EST into an extended assemblage, by pulling additional sequences from different databases
(i.e., Hyseq's database containing EST sequences, dbEST version 114, gb pri 114, and UniGene
version 101) that belong to this assemblage. The algorithm terminated when there were no
additional sequences from the above databases that would extend the assemblage. Inclusion of
component sequences into the assemblage was based on a BLASTN hit to the extending
assemblage with BLAST score greater than 300 and percent identity greater than 95%.
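
The seed-extension procedure described above may be sketched, purely for illustration, as the following Python routine. It is not the algorithm actually used: it is written as an iterative worklist rather than recursively, it queries each newly added member rather than the consensus of the growing assemblage, and the callable find_hits (standing in for a BLASTN search and returning tuples of sequence identifier, score and percent identity) is a hypothetical placeholder.

```python
def extend_assemblage(seed_id, find_hits, min_score=300, min_identity=95.0):
    """Grow an assemblage from a seed EST until no new sequence qualifies.

    find_hits(query_id) is assumed to return an iterable of
    (hit_id, blast_score, percent_identity) tuples.
    """
    members = {seed_id}
    frontier = [seed_id]
    while frontier:  # terminates when nothing further extends the assemblage
        query = frontier.pop()
        for hit_id, score, identity in find_hits(query):
            if hit_id in members:
                continue
            # Inclusion thresholds as stated in the text: strictly greater than.
            if score > min_score and identity > min_identity:
                members.add(hit_id)
                frontier.append(hit_id)
    return members


if __name__ == "__main__":
    # Toy hit table (hypothetical identifiers and scores).
    toy_hits = {
        "seed": [("a", 450, 98.0), ("b", 310, 94.0)],
        "a": [("c", 301, 95.5), ("seed", 450, 98.0)],
        "c": [("d", 300, 99.0)],
    }
    print(sorted(extend_assemblage("seed", lambda q: toy_hits.get(q, []))))
```

In the toy data, sequence "b" is excluded on percent identity and "d" on score, so the assemblage closes with the seed, "a" and "c".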

Table 3 sets forth the novel predicted polypeptides (including proteins) encoded by the
novel polynucleotides (SEQ ID NO: 189-282) of the present invention, and their corresponding
nucleotide locations to each of SEQ ID NO: 189-282. Table 3 also indicates the method by which
the polypeptide was predicted. Method A refers to a polypeptide obtained by using a software
program called FASTY (available from http://fasta.bioch.virginia.edu) which selects a polypeptide
based on a comparison of the translated novel polynucleotide to known polynucleotides (W.R.
Pearson, Methods in Enzymology, 183:63-98 (1990), herein incorporated by reference). Method B
refers to a polypeptide obtained by using a software program called GenScan for human/vertebrate
sequences (available from Stanford University, Office of Technology Licensing) that predicts the
polypeptide based on a probabilistic model of gene structure/compositional properties (C. Burge
and S. Karlin, J. Mol. Biol., 268:78-94 (1997), incorporated herein by reference). Method C refers
to a polypeptide obtained by using a Hyseq proprietary software program that translates the novel
polynucleotide and its complementary strand into six possible amino acid sequences (forward and
reverse frames) and chooses the polypeptide with the longest open reading frame.
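
A six-frame translation of the kind referred to under Method C may be sketched in Python as follows. The proprietary program itself is not disclosed, so this is only an illustrative approximation: it assumes the standard genetic code, treats an open reading frame as the longest stretch between stop codons without requiring an initiating methionine, and uses hypothetical function names.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]


def translate(seq):
    """Translate complete codons; unknown codons (e.g. containing N) give X."""
    return "".join(CODON_TABLE.get(seq[i:i + 3], "X")
                   for i in range(0, len(seq) - 2, 3))


def six_frames(seq):
    """Yield (strand, offset, protein) for the three forward and three reverse frames."""
    seq = seq.upper()
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for offset in range(3):
            yield strand, offset, translate(s[offset:])


def longest_orf(seq):
    """Return (peptide, strand, offset) for the longest stop-free stretch."""
    best = ("", None, None)
    for strand, offset, protein in six_frames(seq):
        for segment in protein.split("*"):
            if len(segment) > len(best[0]):
                best = (segment, strand, offset)
    return best


if __name__ == "__main__":
    print(longest_orf("ATGGCCAAATAG"))  # hypothetical 12-base example
```

For the short example sequence the longest stop-free stretch lies on the reverse strand, which shows why both strands are translated before a polypeptide is chosen.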

The nearest neighbor results for SEQ ID NO: 1-1350 were obtained by a BLASTP
version 2.0al 19MP-WashU search against Genpept release 120 and Geneseq database October
12, 2000, update 21 (Derwent), using BLAST algorithm. The nearest neighbor result showed the
closest homologue for SEQ ID NO: 1-1350. The nearest neighbor results for SEQ ID NO: 1-1350
are shown in Table 2 below.

Tables 1, 2 and 3 follow. Table 1 shows the various tissue sources of SEQ ID NO: 1-1350. 
Table 2 shows the nearest neighbor result for the assembled contig. The nearest neighbor result 
shows the closest homolog with an identifiable function for each assemblage. Table 3 contains the 
start and stop nucleotides for the translated amino acid sequence for which each assemblage 
encodes. Table 3 also provides a correlation between the amino acid sequences set forth in the 
Sequence Listing, the nucleotide sequences set forth in the Sequence Listing and the SEQ ID NO. in 
USSN 09/496,914. 
TABLE 1



Tissue Origin


RNA Source


Hyseq Library Name


SEQ ID NOS:


adult brain 


GIBCO 


AB3001 


111 151 188 215 662-665 877 910 927
976 1233 1319 


adult brain 


GIBCO 


ABD003 


41 49 74 101 111 120 132 141-142 151
217 225 238 271 317 404 446 469 503
513-514 535 550 564 573 666-669 798 
898 910 927 976 1067 1083 1085 1178 
1254 


adult brain 


Clontech 


ABR001 


39 216 238 327 356 535 927 1056 1121
1178-1180 1199 1251 


adult brain 


Clontech 


ABR006 


74 611 949 1034 1136 


adult brain 


Clontech 


ABR008 


14 32 41 61 81 86 89 120 132 138 145 
147 188 197 208 225 227-239 250 300- 
303 312 316 328-331 340 357-362 374
380 384-391 408 414 446 448 464-467
483 488 495-496 505 512 521 535 550
566 571 577 585 590 594 598 634 641 
658 666 683 725 742 764 767 786 801 
805 820 823 826 829 831 836 841 887- 
923 927 934 943 950-951 963 976 995 
1000-1001 1006 1026 1034 1048 1057- 
1067 1086 1088 1090 1118 1120 1122-
1128 1142 1162 1181-1192 1199 1204
1218-1219 1225 1232 1253 1267 1271-
1306 1342 1347 1349-1350


adult brain 


Clontech 


ABR011 


49 238 1219


adult brain 


BioChain 


ABR012 


74 238 


adult brain


Invitrogen 


ABR013 


868 1268 


adult brain 


Invitrogen 


ABT004 


49 117 138 191 217 252 291 305 535
566 596 663 670 746 798 816-819 876 
892 898 922 943 963 1034-1036 1121 


cultured 
preadipocytes 


Strategene 


ADP001 


41 74 101 138 211 238 304 537 582
740 798 883 943 976 1067 


adrenal gland 


Clontech 


ADR002 


49 74 101 111 120 127 151 215 238
240-247 316 330 363-364 404 414 534- 
535 833 924-940 950 963 976 1001 
1003 1067-1070 1118 1156 1193-1200
1325 


adult heart 


GIBCO 


AHR001 


38 49 71-72 74-77 79 92 99 101 111
118 129 132 138 151 158-163 182 195- 
203 215 217 238 264 269 353 384 398 
408 434-439 446 504 512-513 519 537 
562-573 577 611-614 616-619 658 661 
671-672 722 734 757-773 815 828-835 
874 891 898 919 926-927 976 988 
1021 1037 1041 1062 1067 1071 1080 
1083 1093 1122 1131 1185 1201 1254
1308 1331 1335 


adult kidney 


GIBCO 


AKD001 


41 49 51 71-74 78-85 94 100-101 103-
107 111 119-120 138 151 157 215 217-
218 238 250 264 294 304 384 404 440
446 454 477 504-505 509 514 518-519 
535 537 564 574-583 620-627 639 653 
673-675 705 753 789 831 844 851 859 
877 909 918 927 956 963 976 1067 
1074 1083 1095 1178 1302 1331 1335 


adult kidney 


Invitrogen 


AKT002 



11-12 41 49 111-112 215-217 294 316 
446 487 564 575 844 868 910 927 976 
1116 


adult lung 


GIBCO 


ALG001 


8 101 111 151 187 402 446 490 514 
518 537 545 549 580 582 592 594 634
640 651-652 676-678 725 851 873 918 
952 976 1042 1067 1076 1083 1152 


lymph node 


Clontech 


ALN001 


8 111 121 151 180-182 188 215 537
545 549 651 679-682 789 804-810 868 
873 927 952 976 1042 1059 1335 


young liver 



GIBCO 


ALV001 


8 64 79 111 186 215-216 238 446 514 
519 537 564 653 683-684 698 753 798
813 833 840 858 927 976 1038-1039
1051 1085 1224 3245 1256 


adult liver 


Invitrogen 


ALV002 


40 71 292-293 305 384 468-469 496 
505 657 675 714 753 832 844 941-942 
976 1040 1076 1256 1293


adult liver


Clontech


ALV003 


976 


adult ovary 


Invitrogen


AOV001 


8 32 36 38 41 49 51 71 74 79-80 101
104 111 120 122-125 138 140 143-149
151 188-190 207-212 215-217 238 264
316 384 409 440 445-446 496 504 512
514 518-519 535 537 549-550 564 566 
571 580 582 600 618 638 657 667 681 
685-697 699 705 722 735-744 761 771 
815 833 842-865 868 875-876 918 926-
927 950 952 963 976 1023 1042 1048 
1051 1059 1072 1076 1083 1117 1120 
1124 1131 1144 1174 1224 1268 1331
1335 


adult placenta 


Clontech 


APL001 


102 217 238 537 641 700


placenta


Invitrogen


APL002 


663 851 1048 


adult spleen


GIBCO 


ASP001


845 74 111 132 140 151 185 217 238
294 414 446 477 504 514 534 545 549
592 722 873 883 952 976 1041-1042
1083 1093-1094 1152 1224


testis 


GIBCO 


ATS001 


72 107 111 113 126 140 151 183 215 
238 446 497 537 642 701-706 811 877
927 962 976 1083 1117 1131


adult bladder 


Invitrogen 


BLD001 


41 151 191 402-405 409 414 496 545
592 607 706 873 952 1178 1329-1335


bone marrow 


Clontech 


BMD001 


8 58-62 65-68 74 79 108 111 116 137 
147 151 164-174 213-215 238 305-307 
374 404 446 460 466 516 519 534 538- 
541 544-546 549-554 566 584 586 592 
596 607 610 628-629 643-645 652 707- 
708 774-789 844 866-871 873 919 927 
952 963 976 998 1034 1042 1064 1083 
1085 1120 1132 1152 1225 1229 1268 
1307 1310 


bone marrow 


Clontech 


BMD002 


6 8 37-38 52 74 77 105 111 129 132 
210 317 510-511 545 549 581 598 628 
638 724 766 789 844 860 868 873 919 
927 952 963 968 976 1042 1111 1141 
1160-1161 1229 1266 1346 


bone marrow 


Clontech 


BMD004 


111 238 282 549 1083 


adult colon 


Invitrogen


CLN001 


52 260 264 299 494 536 545 564 592 
844 873 877 952 976 1042 1152 1268
1336-1337 


adult cervix 


BioChain 


CVX001 


49 51 129 132 151 205 207 238 332- 
335 365-367 392-401 440 466 470-471 
518 537 597 629 832 877 927 976 1006
1085 1117 1129-1134 1192 1202-1205 
1219 1309-1328 


diaphragm 


BioChain


DIA002 


74 976 1083 


endothelial cells


Strategene


cLfl UU1 


19 A(\JL \ 40 1A 1Q 101 111 17ft 119 
JZ 4U-4 1 /4 /y lUi 111 1ZU 1 -5Z 

l->5 1 Jl ZII4-2U& 21j-Z1 / 2Jo 2t>y 31o 

>i 1 A All «A< CIA C11 C<A <CC CPA COO 

414 4JJ jIIj jIU jlj j jU jjj joU joZ 
596 675 777 745 798 R14 R36-R41 851 

QIC Q7£ 1 04 1 1 (\A1 1 071 1 ni?"? 1 t 
7l o " / O 1 Lh+ J lu*lj 1U/ J IvoJ J 1 J 1 

1 J J i 


Genomic clones


Genomic DNA




•S7S-S19 n ?7 


from the short arm


from Genetic






of chromosome 8


Research








fipnAtntf* HWA 
VJCijuiiiiib' LJiy/-\ 


FPM001 


47 S?S 


from the short arm


from Genetic






of chromosome 8 


Research 






Genomic clones 


Genomic DNA 


EPM004 


525 927 


from the short arm 


from Genetic 






of chromosome 8 


Research 






Genomic clones 


Genomic DNA 


EPM005 


531 


from the short arm 


from Genetic 






of chromosome 8 


Research 






esophagus 


BioChain 


ESO002 


74 138 238 


fetal brain


Clontech


rrRX? AA1 


441-442 7Z / 


fetal brain


Clontech


CRT) C\C\A 


zlJ oyj yLI lUUl 


fetal brain 


Clontech 


FBR006 


48 61 101 120 132 138 140 147 208 
225 271 317 319 336 359 368 405-414 

O Q ? <A <T1 <A/I /CO/C TK TOO *7£/l OO /f 

JlyjjyJD/i jy4 Ooo flJ ILL /04 oz4 

S9Q 0<O OAO QIT Q/fl QylT Q/^Tl lA^T 

ozy &do ojy yuy vz / y4j y4/ yoj iud/ 
1067-1068 1104 1135-1140 1162 1206- 

191^ 19£ft HAI llAC 111Q 
ll\jf IZJJ 1ZOO lZpo 1 JU/-ljUo 1^1" 


fetal brain






111 tto 


fetal brain


Invitrogen


RRT007 


4 1 S 1 1 70 1 S 1 1 07- 1 04 7/vl S04 S 1 7 

535 683 761 798 820-827 844 876 909 
963 976 1026 1048 1083 1 144 1302 


fetal heart 


Invitrogen 


FHR001 


446 566 761 


fetal kidney 


Clontech 


FKD001 


51 74 111 127 140 151 184 294 537
550 630-631 1319 


fetal kidney 


Clontech 


FKD002 


111 976 1083


fetal kidney 


Invitrogen 


FKD007 


238 974 


fetal lung 


Clontech 


FLG001 


463 566 976 1074 1083 1093


fetal lung 


Invitrogen 


FLG003 


41 238 330 407 415-416 537 573 844
859 1048 1083 1116 1192 


fetal liver-spleen 


Columbia 
University 


FLS001 



8 14 34-35 37 41 43 49 51 54-56 63-64 
69-71 74 77 79 87-90 101 107 110-111
114 120 128-131 138 140 147 150-155
197 210 215 217 225 238 312 367 384 
414 440 446 460 468 483 496 504-507 
511-515 518-519 523 533-535 537 541 
544-545 547-550 555-560 564 566 571 

ctj coi coc cot cno CIC HAC £.AH CACX 

bll 5Sz 5oj-5ao 5y& o 3o 64o-o47 64y 
652 664 698 709-710 714 722-723 731 
735-736 746-753 761 784 798 823 829 
832 844 851 858-859 868 873 876 898 
927 943 949 952 963 976 984 1002 

mOI lAO^ 1A4A lA/tO 1 AjM inCA 1AO-2 
1UZ1 ll/ZO l\}h\J IV/'tZ llW^l IUjU IKJoj 

1093 1116 1120 1129 1131 1144 1174
1217 1251 1254 1256 1302 1308 1311 
1319 


fetal liver-spleen 


Columbia 
University 


FLS002 


8 36-37 41-46 49 54 64 71 74 79 101 
111 120 129 147 207 210 215-216 238
250 330 353 359 366 383-384 414 478 
505 508-509 511 515-524 534-535 537 
544-545 564 566 571 577 591 598 638 
663 671 698 714 722 725 727 751 798
851 859 873 876 909 927 949 952 983- 
984 1002 1023 1042-1044 1085 1095 
1131 1144 1178 1199 1233 1240-1270 
1331 1340 


fetal liver-spleen 


Columbia 
University 


FLS003 


64 535 976 1256 



fetal liver 


Invitrogen 


FLV001 


8 101 120 138 217 446 468 535 566 
580 722 730 749 844 918 943 976 1051 
1256 1331 


fetal liver 


Clontech 


FLV004 


537 926 1256 


fetal muscle 


Invitrogen 


FMS001 


51 111 264 312 369-370 404 417-421 
425 535 537 577 598 614 836 857 1141
1208 1268 


fetal muscle 


Invitrogen 


FMS002 


537 


fetal skin 


Invitrogen 


FSK001 


13-26 32 41 51 89 107 111 147 151 
225 264 316 405 422-429 488-494 496 
519 534-535 537 566 675 732 859 876- 
877 898 947 949-950 963 976 1001 
1062 1076 1083 1117 1144 1165 1268 
1281 


fetal skin 


Invitrogen 


FSK002 


537 812 


fetal spleen 


BioChain 


FSP001 


87 549 


umbilical cord 


BioChain 


FUC001 


27-33 41 49 151 215 238 248-249 301
316 446 495-503 519 521 534-535 537 
582 634 691 877 883 927 944-950 963 
976 1001 1075 1142-1143 1171 1218 
1243 1308 


fetal brain 


GIBCO 


HFB001 


41 49 57 79 87 103 111 120 132-135
138 145 151 188 197 207 215 238 264
271 294 316 367 414 440 446 466 504
513-514 535 542-543 550 564 571 596
635 648-654 675 711-715 722-723 798
832 872 876 883 927 976 1095 1144
1168 1171 1178 1211 1335 


macrophage 


Invitrogen 


HMP001 


238 


infant brain 


Columbia 
University 


IB2002 


49-50 77 81 89 105 111 136-138 140 
151 161 175-179 185 216-217 264 295
299 308-310 371-373 462 476 504 511-
513 533 537 564 566 571 655-657 662
683 716-720 723 752 790-803 829 832
858-859 876 898 909 949 976 1045-
1047 1076-1087 1090 1093 1116 1122
1144 1209-1213 1225 1233 1256 1319
1341 


infant brain 


Columbia 
University 


IB2003 


41 50 77 104 132 215 238 508 512-513
519 566 655 714 794 918 943 976 1067 
1092-1093 1233 


infant brain 


Columbia 
University 


IBM002 


311 472-473 753 1214 


infant brain 


Columbia 
University 


IBS001 


51 111 376 474 790 876 949 1144 1204
1221 


lung , fibroblast 


Strategene 


LFB001 


151 316 462 514 534 582 675 939 1131


lung tumor 


Invitrogen 


LGT002 


1-7 41 74 79 94 115 120 138-139 156 
215 217 269 280 296 337 374-375 384
404 446 454 475-480 498 514 518-519
522 537 545 564 577 597 653 658 705 
721-724 754-756 779 859 868 872-874 
876-877 919 927 949 951-952 959 976 
1002 1042 1048-1053 1076 1083 1088- 
1089 1131 1144-1147 1216-1218 1229 
1293 1311


lymphocytes 


ATCC 


LPC001 


41 74 111 132 151 253 316 446 550
634 844 927 976 1085 1268 


leukocyte 


GIBCO


LUC001 


8 11 41 74 86 91-98 101 109 111 120
147 151 212 215 218 238 252 288 312-
314 316 338 359 408 427 443-447 505
510 512 514 518 534 545 549-550 561
564 566 571 577 580 582 587-609 615
632-638 658-659 698 714 725-728 832
836 841 859 866 873-874 882-883 918- 
919 927 943 952 963 976 1042 1076 
1083 1090 1148 1152 1168 1195 1219- 
1220 1224 


leukocyte 


Clontech 


LUC003 


74 100 215 232 238 339-341 446 545
657 660 729 873 883 927 952 963 1008
1042 1116 1120 1149-1150 1215 1222


Melanoma from cell 
line ATCC #CRL 
1424 


Clontech 


MEL004 


210 215 238 342 534 545 592 722 873 
919 929 939 952 976 1071 1118 1218 
1235 1245 


mammary gland 


Invitrogen 


MMG001 


8-10 40-41 49 73 80 114 138-140 147
217 250-256 264 297-299 305 377-378
398 446 481-486 505 512 537 545 549
571 592 725 730-733 816 829 836 844
868 873 876-877 898 926 943 951-960
963 976 995 1034 1042 1048 1054-
1055 1076 1083 1091 1093 1116-1117
1124 1152 1302 


induced neuron cells 


Strategene 


NTD001 


39 101 111 138 238 361 1225 1251 
1319 


retinoic acid induced
neuronal cells 


Strategene 


NTR001 


74 225 976 


neuronal cells 


Strategene 


NTU001 


129 225 238 304 313 361 657 976


pituitary gland 


Clontech 


PIT004 


976 


placenta 


Clontech 


PLA003 


38 976 


prostate 


Clontech 


PRT001 


111 188 238 257-258 564 724 961-966 
1067 1095 


rectum 


Invitrogen 


REC001 


238 430-431 841 859 868 963 1001 
1116 


salivary gland 


Clontech 


SAL001 


8 151 402 432-433 446 496 868 952 
976 1083 1120 1151 1184


small intestine 



Clontech 


SIN001


8 101 147 215 259-266 446 462 505 
545 592 660 789 836 866 873 927 952 
963 967-978 1042 1120 1152 1223-
1224 


skeletal muscle 


Clontech 


SKM001


238 302 927 943 992 1031 


spinal cord 


Clontech 


SPC001 


74 111 132 151 215-216 238 264 267- 
270 343-344 353 379 516 537 566 740 
828 927 976 979-994 1092 1153-1159
1225 1250 


adult spleen 


Clontech 


SPLc01


698 859 1042 


stomach 


Clontech 


STO001 


210 238 271-272 537 580 705 918 952 
995 1171 


thalamus 


Clontech 


THA002 


61 219-220 273-276 312 315 330 596 
963 996-1007 1059 1093 1160-1162 


thymus 


Clontech


THM001


8 120 151 208 221 316-317 353 639
750 867 874 878-881 927 963 1023
1083 1094-1096 1124


thymus 


Clontech 


THMc02 


8 61 114 129 132 210 225 231 306
317-319 336 340 359 380 398 446 448- 
463 512 519 545 554 587 598 698 724- 
725 789 812 836 868 873 927 947 952 
976 1007 1042 1083 1085 1097-1116
1122 1147 1177 1226-1229 1234 1311
1313 


thyroid gland 


Clontech 


THR001 


14 41 49 76 94 111 144 151 183 188 
210 217 222 253 264 271 277-286 294 
320-326 345-352 361 381-382 446 467
483 514 534 549-550 564 578 602 649 
844 882-883 927 950 956 976 1008- 
1028 1076 1083 1117-1120 1142 1163-
1175 1230-1238 1308 


trachea 


Clontech 


TRC001 


223-225 238 287 353-354 514 
545 592 oil of 5 oco-oo4 yJ.1 
952 1029-1031 1042 1151-1152 
1170 1176-1177 1239 


uterus 


Clontech 


UTR001 


151 226 288-290 355 537 877 
885-886 976 1001 1032-1033 
1232 



TABLE 2 



SEQ ID NO:


Accession No.


Species


Description


Smith-Waterman Score


% Identity


1 


B02829 


Homo sapiens 


Human G protein coupled receptor hRUP5 
protein SEQ ID NO: 10. 


460 


100 


2 


G03564 


Homo sapiens 


Human secreted protein, SEQ ID NO: 7645. 


111 


51 


3 


R26173 


Homo sapiens 


Part of Major Yo paraneoplastic antigen 
(CDR62) encoded by clone pY2. 


293 


76 


4 


L29536 


Homo sapiens 


calcium channel L-type alpha 1 subunit 


191 


65 


5 


Y94943 


Homo sapiens 


Human secreted protein clone ytl4I protein 
sequence SEQ ID NO:92. 


251 


50 


6 


M11507


Homo sapiens 


transferrin receptor 


120 


95 


7 


AF099100 


Homo sapiens 


WD-repeat protein 6 


1941 


93 


8 


Y92338 


Homo sapiens 


Human cancer associated antigen precursor from 
clone NY-REN-45. 


245 


82 


9 


G01343 


Homo sapiens 


Human secreted protein, SEQ ID NO: 5424.


226 


91 


10 


AJ133798


Homo sapiens 


copine VII protein 


1127 


68 


11 


G02449 


Homo sapiens 


Human secreted protein, SEQ ID NO: 6530. 


584 


99 


12 


X98330 


Homo sapiens 


ryanodine receptor 2 


282 


78 


13 


AL024498 


Homo sapiens 


dJ417M14.2 (novel serine/threonine-protein
kinase (ortholog of mouse and rat MAK (male
germ cell-associated kinase))


293 


100 


14 


AF045577 


Pan 

troglodytes 


olfactory receptor OR93Ch 


191 


36 


15 


G03131 


Homo sapiens 


Human secreted protein, SEQ ID NO: 7212. 


93 


39 


16 


U26595 


Rattus 
norvegicus 


prostaglandin F2a receptor regulatory protein 
precursor 


569 


89 


17 


B08918 


Homo sapiens 


Human secreted protein sequence encoded by 
gene 28 SEQ ID NO:75.


99 


44 


18 


Y36203 


Homo sapiens 


Human secreted protein #75. 


165 


75 


19 


U15647


Mus 

musculus 


reverse transcriptase 


106 


40 


20 


G02701 


Homo sapiens 


Human secreted protein, SEQ ID NO: 6782. 


544 


100 


21 


Y35923 


Homo sapiens 


Extended human secreted protein sequence, SEQ 
ID NO. 172. 


1691 


100 


22 


G04030 


Homo sapiens 


Human secreted protein, SEQ ID NO: 8111. 


380 


96 


23 


G02455 


Homo sapiens 


Human secreted protein, SEQ ID NO: 6536. 


123 


50 


24 


AF036329 


Homo sapiens 


gonadotropin-releasing hormone precursor, 
second form 


284 


90 


25 


G04067 


Homo sapiens 


Human secreted protein, SEQ ID NO: 8148. 


96 


32 


26 


S80119


Rattus sp. 


reverse transcriptase homolog 


100 


34 


27 


U83303 


Homo sapiens 


line-1 reverse transcriptase 


101 


35 


28 


G03267 


Homo sapiens 


Human secreted protein, SEQ ID NO: 7348. 


135 


45 


29 


G04067 


Homo sapiens 


Human secreted protein, SEQ ID NO: 8148. 


83 


42 


30 


G02872 


Homo sapiens 


Human secreted protein, SEQ ID NO: 6953. 


116 


72 


31 


G03371 


Homo sapiens 


Human secreted protein, SEQ ID NO: 7452. 


96 


67 



32 


G03224 


Homo sapiens 


Human secreted protein, SEQ ID NO: 7305. 


58 


32 


33 


Y66688 


Homo sapiens 


Membrane-bound protein PRO U 52. 


2457 


98 


34 


Y87071 


Homo sapiens 


Human secreted protein sequence SEQ ID 


348 


95 








NO:110. 






35 


U15131 


Homo sapiens 


p126


182 


48 


36 


Y73464 


Homo sapiens 


Human secreted protein clone yl4_l protein 


982 


90 








sequence SEQ ID NO: 150. 






37 


AL133215 


Homo sapiens 


bA108L7.6 (semaphorin 4G (sema domain, 


687 


99 








immunoglobulin domain (Ig), transmembrane 












domain (TM) and short cytoplasmic domain)) 






38 


AC067969 


amino acids


Homo sapiens ryanodine receptor 1 (skeletal) 


386 


66 






3338-4088 








39 


AL031588 


Homo sapiens 


dJ1163J1.1 (mostly supported by GENSCAN,


493 


76 








FGENES and GENEWISE) 






40 


G03628 


Homo sapiens 


Human secreted protein, SEQ ID NO: 7709. 


110 


51 


41 


AF132969


Homo sapiens 


CGI-35 protein 


228 


68 


42 


Y36268 


Homo sapiens 


Human secreted protein encoded by gene 45. 


220 


88 


43 


X61048 


Hydra sp. 


mini-collagen 


105 


35 


44 


M76546 


Helianthus


hydroxyproline-rich protein 


110


31 






annuus 








45 


U82288 


Caenorhabditi 


Rac-like GTPase


139 


70 






s elegans 








46 


G03477 


Homo sapiens 


Human secreted protein, SEQ ID NO: 7558. 


118 


58 


47 


AF090942 


Homo sapiens 


PRO0657 


113 


63 


48 


G03564 


Homo sapiens 


Human secreted protein, SEQ ID NO: 7645. 


90 


59 


49 


AJ005560 


Mus 


SPR2B protein 


72 


56 






musculus 








50 


G02450 


Homo sapiens 


Human secreted protein, SEQ ID NO: 6531. 


385 


98 


51 


Y91649 


Homo sapiens 


Human secreted protein sequence encoded by 


973 


94 








gene 60 SEQ ID NO:322.






52 


U93563 


Homo sapiens 


putative p150


105 


38 


53 


Y55927 


Homo sapiens 


Human STLK2 protein. 


699 


85 


54 


G02607 


Homo sapiens 


Human secreted protein, SEQ ID NO: 6688. 


145 


56 


55 


AB008175 


Mus 


hepatic nuclear factor 1-beta short form 


356 


74 






musculus 








56 


M68941 


Homo sapiens 


protein-tyrosine phosphatase


165 


41 


57 


AL031600 


Homo sapiens 


C390E6.1 (chloride channel 7) 


338 


76 


58 


AF011417 


Mus 


putative pheromone receptor 


143 


55 






musculus 








59 


AF167320 


Mus 


zinc finger protein ZFP113


558 


68 






musculus 








60 


U73036 


Homo sapiens 


interferon regulatory factor 7


263 


96 


61 


X07984 


Mus 


protein-tyrosine kinase 


297 


69 






musculus 








62 


Y29861 


Homo sapiens 


Human secreted protein clone cb98_4. 


791 


98 


63 


U35376 


Homo sapiens 


repressor transcriptional factor 


485 


65 


64 


AF265555 


Homo sapiens 


ubiquitin-conjugating BIR-domain enzyme 


785 


74 








APOLLON 






65 


G03883 


Homo sapiens 


Human secreted protein, SEQ ID NO: 7964. 


88 


95 


66 


AF177390


Manduca 


antennal specific membrane protein AMP


274 


54 






sexta 








67 


AB040800 


Homo sapiens 


SREB2 


614 


100 


68 


AF030027 


Equine 


24 


213 


26 






herpesvirus 4 








69 


G02965 


Homo sapiens 


Human secreted protein, SEQ ID NO: 7046. 


261 


95 


70 


W75770 


Homo sapiens 


Human oxidoreductase YTF03. 


1144 


98 


71 


AB011135 


Homo sapiens 


KIAA0563 protein 


239 


76 


72 


AB014885


Halocynthia 


HrPOPK-1 


813 


78 






roretzi 








73 


AF045454 


Cavia


phospholipase B 


955 


73 






porcellus 








74 


J02870 


Mus 


laminin receptor


308 


61 








musculus 








75 


Y00826 


Rattus 
norvegicus 


gp210(AA 1-1886) 


413 


84 


76 


AF117754


Homo sapiens 


thyroid hormone receptor-associated protein 
complex component TRAP240 


351 


54 


77 


Y38422 


Homo sapiens 


Human secreted protein. 


468 


76 


78 


Y14596


Homo sapiens 


Human T-type voltage-gated Ca channel alpha- 
1-1 (hCavT3). 


1357 


99 


79 


Y14591 


Human
papillomavirus
type 68


APM-1 protein 


767 


100 


80 


AL137802 


Homo sapiens 


dJ798A10.2 (KIAA0445 protein) 


71 


34 


81 


AP000383 


Arabidopsis 
thaliana


protein arginine N-methyltransferase-like protein


359 


65 


82 


L46815 


Mus 

musculus 


DNA binding protein Rc 


895 


75 


83 


G01600 


Homo sapiens 


Human secreted protein, SEQ ID NO: 5681. 


315 


96 


84 


Y53886 


Homo sapiens 


A suppressor of cytokine signalling protein 
designated HSCOP-6. 


538 


71 


85 


AB029002 


Homo sapiens 


KIAA1079 protein 


134 


42 


86 


Y28678 


Homo sapiens 


Human cw272 7 secreted protein. 


325 


62 


87 


Y99368 


Homo sapiens 


Human PRO1326 (UNQ686) amino acid
sequence SEQ ID NO: 100. 


156 


48 


88 


AJ225124 


Mus 

musculus 


hyperpolarization-activated cation channel, 
HAC3 


487 


95 


89 


AF177203 


Homo sapiens 


cerebral cell adhesion molecule 


290 


56 


90 


Y28280 


Homo sapiens 


Human G-protein coupled receptor GR1R-2. 


326 


79 


91 


L39891 


Homo sapiens 


polycystic kidney disease-associated protein


1751 


95 


92 


AF064876 


Homo sapiens 


ion channel BCNG-1 


953 


99 


93 


AF170723


Homo sapiens 


protein kinase STK10 


401 


53 


94 


X13292 


Trypanosoma 
brucei 


GPI-phospholipase C (AA 1-358)


151 


37 


95 


Y34127 


Homo sapiens 


Human potassium channel K+Hnovll. 


661 


99


96


X03638 


Rattus 
norvegicus 


sodium channel protein I (aa 1-2009) 


1775 


92 


97 


AF134213 


Homo sapiens 


ubiquitin-specific protease 


1995 


99 


98 


G00838 


Homo sapiens 


Human secreted protein, SEQ ID NO: 4919. 


213 


38 


99 


AF021935 


Rattus 
norvegicus 


myotonic dystrophy kinase-related Cdc42-binding
kinase 


675 


48 


100 


AF279265 


Homo sapiens 


putative anion transporter 1 


867 


98 


101 


AC007878 


Homo sapiens 


match to nuclear protein, NP220; note sequence 
difference at residue 58 


160 


60 


102 


U22829 


Mus 

musculus 


P2Y purinoceptor 


264 


42 


103 


Y45023 


Homo sapiens 


Human sensory transduction G-protein coupled 
receptor-B3. 


516 


99 


104 


Y94990 


Homo sapiens 


Human secreted protein vb21_1, SEQ ID NO:20.


787 


98 


105 


Y87342 


Homo sapiens 


Human signal peptide containing protein HSPP-119
SEQ ID NO:119.


343 


57 


106 


AF169312 


Homo sapiens 


hepatic angiopoietin-related protein 


212 


67 


107 


AF116657


Homo sapiens 


PRO1310 


74 


52 


108 


AE000401 


Escherichia 
coli 


sialic acid transporter 


587 


96 


109 


Y38395 


Homo sapiens 


Human secreted protein encoded by gene No. 10. 


693 


100 


110 


Y78801 


Homo sapiens 


Hydrophobic domain containing protein clone 
HP00631 amino acid sequence. 


182 


94 


111 


Z25535 


Homo sapiens 


nuclear pore complex protein hnup153


464 


85 


112 


Y94939 


Homo sapiens 


Human secreted protein clone ye90_1 protein
sequence SEQ ID NO:84. 




CI 

31 


113 


AF016365 


Homo sapiens 


hexokinase 1 isoform td


301 


71 


114 


AC007956 


Homo sapiens 


unknown 


520 


75 


115 


M83738 


Homo sapiens 


protein-tyrosine phosphatase 


251 


92 


116 


AL157952


Homo sapiens 


dJ875K15.1.1 (ets homologous factor (ets- 
domain transcription factor ESE-3A, isoform 1)) 


484 


91 


117 


W18084


Homo sapiens 


Human Aurora-2. 


546 


87


118


1 A1 SI 1ft 


Homo sapiens 


cam kinase I 


407 


62 


119 


AJ006710 


Rattus 
norvegicus 


phosphatidylinositol 3-kinase 


627 


93 


120 


AF026954 


Bos taurus 


pyruvate dehydrogenase phosphatase regulatory 
subunit precursor; PDPr 


1646 


94 


121 


S39392 


Homo sapiens 


protein tyrosine phosphatase, PTPase {EC 
3.1.3.48} 


373 


68 


122 


U60805 


Homo sapiens 


oncostatin-M specific receptor beta subunit


262 


88 


123


I 444U3 


Homo sapiens 


Human truncated tankyrase-1. 


111 


35 


124


U35J 0 / 


Caenorhabditis
elegans


contains similarity to C2 domains 


219 


29 


125




Homo sapiens 


guanine nucleotide binding protein beta subunit 
4 


693 


90 


126


AT5H71 fi£1 


Mus 

musculus


apoptosis signal-regulating kinase 2 


153 


65 


127


APin^7i n 


Homo sapiens 


concentrative Na+-nucleoside cotransporter 
hCNT3 


807 


97 


128


JViyiooU 


Homo sapiens


protein kinase 


220 


73 


129


rY177fi7 


Homo sapiens


alpha 1C adrenergic receptor isoform 2


574 


86 


130


APOAftfVtl 
/VrZvolr43 


Homo sapiens


11*1 16b 


496 


67 


131 


AF201734 


Mus 

musculus 


testis specific serine kinase-3 


800 


87 


132


AF112886


Bos taurus 


differentiation enhancing factor 1


159 


74 


133 


AJ278314 


Homo sapiens 


phospholipase C-beta-1b


554 


85 




W74802


Homo sapiens 


Human secreted protein encoded by gene 73 
clone HSQEL25. 


1157 


87 




AttUzUiJj 


Homo sapiens 


pancreas-specific gene


668 


96 


136 


W80408 


Homo sapiens 


A secreted protein encoded by clone dt674 2. 


866 


98 


137 


AC002563 


Homo sapiens 


putative RHO/RAC effector protein; 95% 
similarity to P49205 (PID:g1345860)


5041 


99 


138 


Y96736 


Homo sapiens 


PRO3434, a novel secreted protein.


891 


100 


139


AJBU24034 


Arabidopsis 
thaliana


DNA-damage inducible protein DDI1-like


147 


55 


140




Homo sapiens 


Human GTPase regulator GRAF. 


248 


56 


141


I jl j5/ 


Homo sapiens 


Human PLA2 protein. 


125 


46 


142


ArUyUl 13 


Rattus 
norvegicus 


AMPA receptor binding protein 


623 


93 


143


W/004z 


Homo sapiens 


Human RECK cancer-inhibiting protein. 


641 


82 


144 


U87306 


Rattus 
norvegicus 


transmembrane receptor UNC5H2 


578 


84 


145 


AF264014 


Homo sapiens 


scavenger receptor cysteine-rich type 1 protein 
M160 precursor


727 


92 


146


W03003 


Homo sapiens 


Human secreted protein 3. 


140 


40 


147 


M96264 


Homo sapiens 


galactose-1-phosphate uridyl transferase


513 


81 


148


L/64UI4 


Escherichia
coli


HrsA 


818 


90 




M.QJJ lo 


Escherichia
coli


pppGpp phosphohydrolase


915 


95 


150


AT 1tfV)7Q 


Homo sapiens 


homolog to cAMP response element binding and
beta transducin family proteins 


1261 


99 


151


AF1 7QCA7 


Homo sapiens 


b Ili20-like kinase 


940 


99 


152


XV?JJ JZ 


Homo sapiens


Tumor necrosis factor receptor 1 death domain 
ligand (clone 3TW).


392 


61 


153


API SI S^Q 


Homo sapiens 


CGI-101 protein


370 


92 


154 


X66957 


Homo sapiens 


hexokinase type 1 


489 


81 


155


I 1o3jj 



Homo sapiens 


alternatively spliced form 


432 


92 


156 


G00857 


Homo sapiens


Human secreted protein, SEQ ID NO: 4938.


349 


78 


157 


AF159455 


Mus 

musculus 


zinc finger protein 


352 


74 


158 


L76191 


Homo sapiens 


interleukin-1 receptor-associated kinase 


537 


76 


159 


AP001743 


Homo sapiens 


putative gene, ankirin like, possible dual 
specifity Ser/Thr/Tyr kinase domain 


670 


98 


160 


AJ250425 


Rattus 
norvegicus 


Collybistin I 


556 


74 


161 


G02885 


Homo sapiens 


Human secreted protein, SEQ ID NO: 6966. 


370 


100


162 



Z22968 


Homo sapiens 


M130 antigen


610 



100 


163 


AF181121 


Homo sapiens 


ATP-dependent Ca2+ pump PMR1 


336 


92 


164 


AF055636 


Homo sapiens 


leucine-rich glioma-inactivated protein precursor


455 


94 


165 


AF160798


Rattus 
norvegicus 


calcium transporter CaT1


700 


96 


166 


Y76332 



Homo sapiens 


Fragment of human secreted protein encoded by
gene 38.


^27 


45 


167 


Y48607 


Homo sapiens 


Human breast tumour-associated protein 68. 


1072 


99 


168 


AB020741 


Mus 

musculus 


NIK-related kinase


197 


43 


169 


AF252293 


Homo sapiens 


PAR3 




44 


170 


U59429 


Cricetinae 
gen. sp. 


diacylglycerol kinase eta


481 


82 


171 


AF035268 


Homo sapiens 


phosphatidylserine-specific phospholipase A1




49 


172 


AF127085 


Mus 

musculus 


semaphorin cytoplasmic domain-associated 
protein 3B 


507 


82 


173 


Y27918 


Homo sapiens 


Human secreted protein encoded by gene No. 
123. 






174 


G02979 


Homo sapiens 


Human secreted protein, SEQ ID NO: 7060.


J JO 


07 


175 


U36488 


Mus 

musculus 


embryonic stem cell phosphatase 



168 


55 


176 


W95629 


Homo sapiens 


Homo sapiens secreted protein gene clone 
gml9o 4. 


1 AOO 
1 KjLL 


inn 



177 


AF289023 


Homo sapiens 


formiminotransferase cyclodeaminase form u




0^ 

*rJ 


178 


X04936 


Homo sapiens 


T-cell receptor alpha-chain (41 j is 2nd base in
codon) 


71 fl 
/ 1 U 


00 


179 


AF127481


Homo sapiens 


non-oncogenic Rho GTPase-specific GTP
exchange factor 


175 


80 


180 


G00978 


Homo sapiens 


Human secreted protein, SEQ ID NO: 5059.


517


94 


181 


Y66645 


Homo sapiens


Membrane-bound protein PRO1310.


U/ I 


06 


182 


AF110640


Homo sapiens 


orphan seven-transmembrane receptor 


862 


100 


183 


AB020854 


Bos taurus 


orphan transporter short splicing variant


7ftft 
/ oo 




184 


AF169691


Homo sapiens 


cadherin-like protein VRo


^75 

j § j 


1R 


185 


AF126372 


Homo sapiens 


thyrotropin-releasing hormone degrading 
ectoenzyme 




00 
yy 


186 


L20966 



Homo sapiens 


phosphodiesterase 






187 


G02920 


Homo sapiens 


Human secreted protein, SEQ ID NO: 7001. 


254 


93 


188 


Y94918 


Homo sapiens 


Human secreted protein clone dd5U4_Jo protein 
sequence SEQ ID NO:42. 




OS 


189 


Y66713 


Homo sapiens 


Membrane-bound protein PKUliUy. 


f%QA 


inn 


190 


G03244 


Homo sapiens 


Human secreted protein, SEQ ID NO: 7325.


jj 1 


71 


191 


U36771 


Rattus 
norvegicus 


sn-glycerol 3-phosphate acyltransferase 


707 


92 


192 


R05935 


Homo sapiens 


Secreted GPIIb subunit of multiple subunit 
polypeptide (MSP) GPIIb-IIIa.


157 


72 


193 


M92084 


Theileria 
parva 


casein kinase II alpha subunit 


364 


50 


194 


Y66645 


Homo sapiens 


Membrane-bound protein PRO1310. 


448 


90 


195 


W95631 


Homo sapiens 


Homo sapiens secreted protein gene clone 
hj968 2. 


382 


49 


196 


AF255614 


Rattus 
norvegicus 


scaffolding protein SLIPR 


ooU 


00 


197 


AC021640 


Arabidopsis
thaliana 


putative phosphatidate phosphohydrolase


inn 


41 


198 


AF073967 


Mus 

musculus 
domesticus


olfactory receptor 


316 


43 


199


W V/ 1 / Jw 


Homo sapiens


Human G-protein receptor HPRAJ70.


617 


98 


200 


AF117948


Homo sapiens 


pancreas-enriched phospholipase C 


625 


89 


201 


AF128625 


Homo sapiens 


CDC42-binding protein kinase beta 


636 


94 


202 


AF117946 


Homo sapiens 


Link guanine nucleotide exchange factor II 


1303 


100 


203 


Y53021 


Homo sapiens 


Human secreted protein clone qc646_l protein 
sequence SEQ ID NO:48. 


701 


99 


204 


AF227968 


Homo sapiens 


SH2-B beta signaling protein 


182 


79 


205 


S81752 


Homo sapiens 


DPH2L=candidate tumor suppressor gene
{ovarian cancer critical region of deletion}


375


100






206 


U18315 


Sus scrofa 


parathyroid receptor 


122 


60 


207 


AF255342 


Homo sapiens 


putative pheromone receptor V1RL1 long form 


170 


96 


208 


S52051 


Rattus sp. 


neurotransmitter transporter 


715 


94 


209 


W63683 


Homo sapiens 


Human secreted protein 3. 


840 


99 


210 


D79992 


Homo sapiens 


similar to Drosophila photoreceptor cell-specific 
protein, calphotin. 


541 


82 


211 


AF117948


Homo sapiens 


pancreas-enriched phospholipase C 


1348 


99 


212 


U81035 


Rattus 
norvegicus 


ankyrin binding cell adhesion molecule 
neurofascin 


471 


69 


213 


AF154846


Homo sapiens 


zinc finger protein 


798 


56 


214 


AF102777


Mus 

musculus 


FYVE finger-containing phosphoinositide kinase 


933 


93 


215 


AL163303


Homo sapiens 


putative gene containing transmembrane domain 


523 


89 


216 


U26595 


Rattus 
norvegicus 


prostaglandin F2a receptor regulatory protein 
precursor 


563 


78 


217 


G04095 


Homo sapiens 


Human secreted protein, SEQ ID NO: 8176. 


644 


98 


218 


X75756 


Homo sapiens 


protein kinase C mu 


314 


81 


219 


Y66723 


Homo sapiens 


Membrane-bound protein PRO1100.


770 


98 


220 


D88577 


Mus 

musculus 


Kupffer cell receptor 


567 


40 


221 


AF258465 


Homo sapiens 


OTRPC4 


853 


100 


222 


AF021935 


Rattus 
norvegicus 


mytonic dystrophy kinase-related Cdc42-binding 
kinase 


636 


96 


223 


AL136527


Homo sapiens 


bA215B13.1 (A kinase (PRKA) anchor protein 
11) 


693 


100 


224 


AB032417 


Homo sapiens 


WNT receptor Frizzled-4 


690 


99 


225 


AF030430


Mus 

musculus 


semaphorin VIa


703 


68 


226 


AE000218 


Escherichia 
coli 


putative dihydroxyacetone kinase (EC 2.7.1.2)


297 


39 


227 


AF302150 


Homo sapiens 


phosphoinositol 3-phosphate-binding protein-2 


2080 


100 


228 


AB024573 


Mus 

musculus 


GTP-binding like protein 2 


265 


88 


229 


AF122924 


Xenopus 
laevis 


Wnt inhibitory factor-1


316 


40 


230 


G03205 


Homo sapiens 


Human secreted protein, SEQ ID NO: 7286. 


229 


100 


231 


X98260 


Homo sapiens 


M-phase phosphoprotein 11


265 


92 


232 


R92754 


Homo sapiens 


Human growth differentiation factor-12.


682 


95 


233 


R75111


Homo sapiens 


Glycosyl-phosphatidylinositol-specific 
phospholipase-D. 


290 


100 


234 


W69431 


Homo sapiens 


Human secreted protein cw1233_3.


235 


97 


235 


Y08686 


Homo sapiens 


serine palmitoyltransferase, subunit II 


859 


81 


236 


AF118275


Homo sapiens 


atrophin-related protein ARP 


117 


37 


237 


X81466 


Mus 

musculus 


Embryo Brain Kinase 


460 


62 


238 


U64857 


Caenorhabditis
elegans


similar to the BPTI/Kunitz family of inhibitors;
most similar to tissue factor pathway inhibitor 
precursor (TFPI) 


284 


33 


239 


AJ250840 


Mus 

musculus 


serine/threonine protein kinase 


739 


63 


240 


AJ223472 


Mus 

musculus 


transcription elongation factor TFIIS.h 


222 


38 


241 


Y94906 


Homo sapiens 


Human secreted protein clone rb649 3 protein 
sequence SEQ ID NO: 18.


353 


52 


242 


AF169301


Homo sapiens 


Na+/sulfate cotransporter SUT-1


591 


99 


243 


L22022 


Rattus 
norvegicus 


orphan transporter v7-3 


667 


93 


244 


AF016191 


Rattus 
norvegicus 


potassium channel 


1043 


98 


245 


AF097366 


Homo sapiens 


cone sodium-calcium potassium exchanger 


645 


98 


246 


Y29868 


Homo sapiens 


Human secreted protein clone pp325 9. 


497 


98 


247 


AF180475


Homo sapiens 


Not4-Np 


188 


83 


248 


Y17227


Homo sapiens 


Human secreted protein (clone yal-1). 


690 


99 


249 


AF250910 


Manduca
sexta


death-associated small cytoplasmic leucine-rich
protein SCLP


182


31








AP1G77^£ 


Kaposi's
sarcoma-associated
herpesvirus


r\rfii 


1 "KA 


id. 


251 


AB022694 


Homo sapiens 


MOK protein kinase 


209 


83 


252










100 




T j4£R 1 S 


musculus 




Z J 1 


fn 

o / 








rluiilaJl al/iU AClidillg lUHIC CilaJIiICl. 


171 


R7 

OZ 




AP07fMVv£ 

/ITU/WOO 


nniicftiliic 


Vw- 1 Ll UIl rv AJJlaoC 


1 -£U 1 


OR 


256 


G02491 


Homo sapiens 


Human secreted protein, SEQ ID NO: 6572. 


460 


100 


257




\jiy lioj ogub 


j Jl USpiJ UJ jpdoC 


36R 


RO 


258




Homo sapiens


riUIHall OalUlUIIl HJdnilCJ JUL/ Jf \^rvr\V_x"Z.. 


1 RS7 

1 OJ / 




259 


AJ222968 


Mus 


L-periaxin 


430 


72 


260 


AJ250839 


Homo sapiens 


serine/threonine protein kinase 


861 


100 


261




Homo sapiens


AMP-activated protein kinase gamma j subunit


/JO 


QQ 


262


At 14 USD 


Rattus
norvegicus 


QT IT T 


1 Q« 
1 ?5 


4U 


263


AJ"U/ZdJ7 


Homo sapiens 


neuropil in-z^au j 




Oz 


264


At 1o04V / 


Homo sapiens 


Ig superfamily receptor LNIR precursor 


35 / 




265 


Y44662 


Homo sapiens 


Human 14273 G-protein coupled receptor 


636 


99 


266 


U27269 


Mus 

musculus 


sodium glucose cotransporter 


204 


56 


267 


AF124491 


Homo sapiens 


ARF GTPase-activating protein GIT2 


159 


75 


268 


AF 127389 


Rattus 
norvegicus 


putative taste receptor TR1 



209 


39 




Xyozyo 


Homo sapiens


ubiquitin hydrolase 


215 


95 


270 


X78482 


Streptococcus 
pyogenes 


Fc-gamma receptor 


129 


26 


271


AB009883 


Nicotiana 
tabacum 


KED 



109 


26 


272 


AF137367


Mus 

musculus 


VPS10 domain receptor protein SORCS


899 



97 


273


LJ4yjo 


Rattus 
norvegicus 


ionotropic glutamate receptor 


46U 


00 


274 


AL022724 


Homo sapiens 


dJ413H6.1.1 (hamster Androgen-dependent
Expressed Protein like putative protein)
(isoform 1)


188 


74 


275


ivi J J J 




ubiquitin-conjugating BIR-domain enzyme
APOLLON




QA 


276 


G02872 


Homo sapiens 


Human secreted protein, SEQ ID NO: 6953. 


148 


56 


277


IjtVJ OU 


Homo sapiens


thyroid receptor interactor




Dl 


278 


AB046851 


Homo sapiens 


KIAA1631 protein 


283 


96 


279




Arabidopsis

thaliana 


contains PF|00069 Eukaryotic protein kinase
domain. 


1 ^7 




280


M83738


Homo sapiens


protein-tyrosine phosphatase


191 


11 

tJ 


281


AVA7/11Q7 


Homo sapiens


unnamed protein product




yi 


282


i\r 141 jzo 


Homo sapiens


iviN/\ neucase riL/Di lj 


407 


ftA 


283




Mus 

musculus


u i o-oomain transcripiionai repressor rui 




fo 




l Zyj jO 


Homo sapiens


nuiriaii ^ccreicu pruiein ciune w / jo^_z ai icrnaic 
rftaflino frarnp nrntpin 

IvaUIUg llulUv L/lVrlwill* 




mo 

1UO 


285 


Y73402 


Homo sapiens 


Human secreted protein clone yc25 1 protein 
sequence SEQ ID NO:26.


300 


90 


286 


AF016411 


Homo sapiens 


KCNA3.1B 


137 


100 



287 


W89253 


Homo sapiens 


Human ALP. 


688 


97 


288 


AF112886


Bos taurus 


differentiation enhancing factor 1 


750 


96 


289 


AF113131 


Homo sapiens 


host cell factor homolog LCP 


367 


44 


290 


U52111 


Homo sapiens 


plexin-related protein 


698 


100 


291 


AF026504 


Rattus
norvegicus


SPA-1 like protein p1294


603


89



WO 01/57188



PCT/US01/03800



SEQ

ID

NO:


Accession
No.


Species


Description


Smith-
Waterman
Score


%

Identity








292 


AF102854


Rattus 
norvegicus 


membrane-associated guanylate kinase- 
interacting protein 2 Maguin-2 


124 


53 


293 


X99211 


Drosophila 
melanogaster


ubiquitin-specific protease 


143 


38 


294 


Y94943 


Homo sapiens 


Human secreted protein clone yt14_1 protein
sequence SEQ ID NO:92. 


185 


94 


295 


Y94890 


Homo sapiens 


Human protein clone HP02798. 


108 


59 


296 


AF019767


Homo sapiens 


zinc finger protein 


154 


96 


297 


Y28568 


Homo sapiens 


Secreted peptide clone bd577_l. 


568 


84 


298 


Y94943 


Homo sapiens 


Human secreted protein clone yt14_1 protein
sequence SEQ ID NO:92.


182 


97 


299 


B08906 


Homo sapiens 


Human secreted protein sequence encoded by 
gene 16 SEQ ID NO:63.


605 


69 


300 


R58890 


Homo sapiens 


Human-32 cadherin-related molecule.


212 


97 


301 


AF022859 


Homo sapiens 


neuropilin-2(a0) 


277 


100 


302 


Y71124 


Homo sapiens 


Human mitogenic regulator duox2. 


716 


97 


303 


Y44297 


Homo sapiens 


Human receptor tyrosine kinase. 


228 


97 


304 


D32050 


Homo sapiens 


alanyl-tRNA synthetase 


192 


80 


305 


U43586 


Homo sapiens 


protein kinase related to Raf protein kinases; 
Method: conceptual translation supplied by 
author 


428 


72 


306 


R54872 


Homo sapiens 


Human H13 viral receptor mutant 4. 


280 


95 


307 


D78572 


Mus 

musculus 


membrane glycoprotein 


199 


41 


308 


AF255614 


Rattus 
norvegicus 


scaffolding protein SLIPR 


639 


88 


309 


S79463 


Mus sp. 


semaphorin homolog=M-Sema F 


162 


89 


310 


AF178941 


Homo sapiens 


ATP-binding cassette sub-family A member 2 


736 


100 


311 


U03413 


Dictyostelium 
discoideum 


calcium binding protein 


151 


36 


312 


Y87347 


Homo sapiens 


Human signal peptide containing protein HSPP-124
SEQ ID NO:124.


744 


100 


313 


Z97055 


Homo sapiens 


dJ388M5.4 (putative GS2 like protein) 


789


99 


314 


AC004010 


Homo sapiens 


similar to Leucine-rich transmembrane proteins; 
44% similarity to U42767 (PID:g1736918)


197 


38 


315 


AL021392 


Homo sapiens 


dJ439F8.2 (supported by GENSCAN and
GENEWISE) 


278 


38 


316 


U70209 


Mus 

musculus 


polycystic kidney disease 1 protein 


165 


38 


317 


AF109643 


Rattus 
norvegicus 


coxsackie-adenovirus-receptor homolog 


223 


38 


318 


AF104923 


Homo sapiens 


putative transcription factor 


138 


84 


319 


AF100287


Trypanosoma 
vivax 


activated protein kinase C receptor homolog 


141 


38 


320 


G00588 


Homo sapiens 


Human secreted protein, SEQ ID NO: 4669. 


125 


51 


321 


Y21591 


Homo sapiens 


Human secreted protein (clone CC332-33). 


459 


97 


322 


D26070 


Homo sapiens 


human type 1 inositol 1,4,5-trisphosphate 
receptor 


232 


97 


323 


Y27918 


Homo sapiens 


Human secreted protein encoded by gene No. 
123. 


306 


88 


324 


AF010144 


Homo sapiens 


neuronal thread protein AD7c-NTP 


209 


70 


325 


M19650 


Homo sapiens 


2',3'-cyclic-nucleotide 3'-phosphodiesterase (EC
3.1.4.37) 


214 


97 


326 


W80396 


Homo sapiens 


A secreted protein encoded by clone bp646_10. 


140 


70 


327 


X75756 


Homo sapiens 


protein kinase C mu 


540 


78 


328 


G02292 


Homo sapiens 


Human secreted protein, SEQ ID NO: 6373. 


721 


99 


329 


AF168990 


Homo sapiens 


putative GTP-binding protein 


877 


99 


330 


S67984 


Homo sapiens 


anti-HIV gp120 antibody heavy chain variable
region 


581 


80 


331 


X13916 


Homo sapiens 


LDL-receptor related precursor (AA -19 to 4525) 


2823 


98 


332 


Y87330 


Homo sapiens 


Human signal peptide containing protein HSPP- 
107 SEQ ID NO: 107. 


1127 


100 


333 


Y28503 


Homo sapiens 


HGFH3 Human Growth Factor Homologue 3. 


320 


98 


334 


AC002563 


Homo sapiens 


putative RHO/RAC effector protein; 95%
similarity to P49205 (PID:g1345860)


327


93






335 


Y87347 


Homo sapiens 


Human signal peptide containing protein HSPP-124
SEQ ID NO:124.


1111 


67 


336 


AF006466 


Mus 

musculus 


lymphocyte specific formin related protein


193 


75 


337 


AF265555 


Homo sapiens 


ubiquitin-conjugating BIR-domain enzyme 
APOLLON 


632 


97 


338 


Y13443


Homo sapiens 


Amino acid sequence of hSlo3-2. 


516 


100 


339 


Y07637 


Homo sapiens 


putative GABA-gated chloride channel 


189 


100 


340 


Y05734 


Homo sapiens 


Human Grb7 effector 2.2412 protein. 


2156 


99 


341 


AE000497 


Escherichia 
coli 


L-idonate transcriptional regulator 


928 


98 


342 


D90855 


Escherichia 
coli 


glycerol-3-phosphate dehydrogenase (EC 
1.1.99.5) chain A, anaerobic


769 


99 


343 


D85613 


Escherichia 
coli 


membrane component 


399 


100 


344 


M93239 


Escherichia 
coli 


transmembrane protein 


232 


100 


345 


M60177 


Escherichia 
coli 


enterobactin 


759 


99 


346 


D90699 


Escherichia 
coli 


Sensor protein copS (EC 2.7.3.-). 


638 


97 


347 


D90843 


Escherichia 
coli 


CapB protein. 


552 


100 


348 


M13422


Escherichia 
coli 


49 kd protein


1193 


96 


349 


L10328 


Escherichia 
coli 


similar to drug resistance translocases 


340 


90 


350 


X69942 


Mus 

musculus 


enhancer-trap-locus-1 


560 


82 


351 


AF239613 


Homo sapiens 


apamin-sensitive small-conductance Ca2+-
activated potassium channel 


463 


80 


352 


D90777 


Escherichia 
coli 


3-hydroxybutyryl-CoA dehydrogenase (EC 
1.1.1.157) (b-hydroxybutyryl-CoA
dehydrogenase) (BhbD). 


577 


100 


353 


D90863 


Escherichia 
coli 


similar to 


311 


98 


354 


Y52386 


Homo sapiens 


Human transmembrane protein HP02000. 


133 


58 


355 


Y31645 


Homo sapiens 


Human transport-associated protein-7 (TRANP- 
7).


482 


55 


356 


Y58637 


Homo sapiens 


Protein regulating gene expression PRGE-30. 


119 


51 


357 


AF119226 


Homo sapiens 


dual-specificity tyrosine phosphatase YVH1 


1788 


100 


358 


Y87219 


Homo sapiens 


Human secreted protein sequence SEQ ID 
NO:258. 


165 



100 


359 


J00132 


Homo sapiens 


beta-fibrinogen 


233 


93 


360 


G03789 


Homo sapiens 


Human secreted protein, SEQ ID NO: 7870. 


128 


70 


361 


R28916 


Homo sapiens 


Type III procollagen (prior art). 


108 



40 


362 


U16655 


Rattus 
norvegicus 


phospholipase C delta-4 


649 


65 


363 


G03119 


Homo sapiens 


Human secreted protein, SEQ ID NO: 7200. 


95 



42 


364 


U47276 


Gallus gallus


chicken brain factor-2 


104 


34 


365 


G03789 


Homo sapiens 


Human secreted protein, SEQ ID NO: 7870. 


183 


65 


366 


G04091 


Homo sapiens 


Human secreted protein, SEQ ID NO: 8172. 


118 


46 


367 


X98258 


Homo sapiens 


M-phase phosphoprotein 9 


564 



75 


368 


AL021366 


Homo sapiens 


CICK0721Q.3 (Kinesin related protein) 


3387 



99 


369 


U70932 


Peromyscus 
leucopus 


reverse transcriptase 


92 


59 


370 


X86400 


Homo sapiens 


gamma subunit of sodium potassium ATPase
like 




TX 

15 


371 


G03172 


Homo sapiens 


Human secreted protein, SEQ ID NO: 7253. 


165 


56 


372 


U49974 


Homo sapiens 


mariner transposase 


257 


55 


373 


X13916 


Homo sapiens 


LDL-receptor related precursor (AA -19 to 4525)


21193 


99 


374 


AF234765 


Rattus 
norvegicus 


serine-arginine-rich splicing regulatory protein 
SRRP86 


1182 


78 


375 


U49974 


Homo sapiens 


mariner transposase 


172 


55 





376 


G01984


Homo sapiens 


Human secreted protein, SEQ ID NO: 6065. 


221 


67 


377


G00669 


Homo sapiens 


Human secreted protein, SEQ ID NO: 4750. 


600 


100 


378 


X52574


Mus 

musculus 


GTP binding protein


1456 


91 







Homo sapiens 


Anti-HIV Fab tat31 light chain.


68 


37 


380


JU47 /4 


Homo sapiens 


alpha-2 type XI collagen 


125 


37 


381


AJoUOz4U5 


Homo sapiens 


LAK-4p 


530 


43 




U04o3U 


Dictyostelium 
discoideum 


protein tyrosine kinase 


115 


44 






Homo sapiens 


Human secreted protein, SEQ ID NO: 6997. 


618 


98 




G01194


Homo sapiens 


Human secreted protein, SEQ ID NO: 5275. 


617 


93


385


/\JZ4 JoLL 


Homo sapiens


type I transmembrane receptor 


4560 


100 


386


UoOzf f 4 


Homo sapiens 


KIAA0220


2148 


98 


387


G03203


Homo sapiens 


Human secreted protein, SEQ ID NO: 7284. 


142 


50 


388 


G04072 


Homo sapiens 


Human secreted protein, SEQ ID NO: 8153. 


99 


59 




M12140


Homo sapiens 


envelope protein 


197 


51 




AJ293309 


Homo sapiens 


NHP2 protein 


461 


77 


391 


Y42751 


Homo sapiens 


Human calcium binding protein 2 (CaBP-2). 


181 


94 


392 


W48351 


Homo sapiens 


Human breast cancer related protein BCRB2. 


241 


66 


393 



Y 14442 


Homo sapiens 


olfactory receptor protein 


339 


54 


394 


W85607 


Homo sapiens 


Secreted protein clone da228 6. 


957 


100 


395 


Y76332 


Homo sapiens 


Fragment of human secreted protein encoded by 
gene 38. 


171 


34 


396 


G03930 


Homo sapiens 


Human secreted protein, SEQ ID NO: 8011.


250 


100 


397 


AB032904 


Hylobates 
syndactylus 


dopamine receptor D4 


105 


35 



398 


AJ007798 


Homo sapiens 


stromal antigen 3, (STAG3) 


861 


85


399 


Y91405 


Homo sapiens 


Human secreted protein sequence encoded by 
gene 2 SEQ ID NO:126.


1047 


92 


400 


Y29861 


Homo sapiens 


Human secreted protein clone cb98_4. 


162 


37 


401 


D87002 


Homo sapiens 


similar to rat integral membrane glycoprotein; 
accession number Z21513. 


527 


78 


402 


AF100754


Homo sapiens 


ancient ubiquitous protein AUP1 isoform 


853 


95 


403 


X74904 


Gallus gallus


alpha-2-macroglobulin receptor 


258 


60 


404 


AF075462 


Mus 

musculus 


ADP-ribosylation factor-directed GTPase 
activating protein isoform b 


545 


89 


405 


X92887 


Human 
endogenous 
retrovirus K 


pol/env 


162 


30 


406 


Y30162 


Homo sapiens 


Human dorsal root receptor 4 hDRR4. 


325 


72 


407 


AK022626 


Homo sapiens 


unnamed protein product 


2833 


99 


408 


L13802


Homo sapiens 


ribosomal protein small subunit


264 


92 


409 


Y91600 


Homo sapiens 


Human secreted protein sequence encoded by 
gene 9 SEQ ID NO:273.


1788 


89 


410 


W88745 


Homo sapiens 


Secreted protein encoded by gene 30 clone 
HTSEV09. 


2004 


99 


411 


AB043953


Mus 

musculus 


Chat-H 


2628 


82 



412 


Y86233 


Homo sapiens 


Human secreted protein HNTMX29, SEQ ID
NO: 148.


1014 


92 


413 


U10542


Pan
troglodytes


MHC class I A 


265 


71 



414 


AF155097


Homo sapiens 


NY-REN-7 antigen


850 


95 


415 


G03203 


Homo sapiens 


Human secreted protein, SEQ ID NO: 7284. 


88 


48 


416 


Y57911


Homo sapiens 


Human transmembrane protein HTMPN-35. 


266 


89 


417 


W27651 


Homo sapiens 


Secreted protein AT205. 


481 


60 


418


I /Ooo4 


Homo sapiens


Retinoblastoma binding protein-7 sequence.


3077 


87 


419 


AF255559 


Notothenia 
coriiceps 


alpha tubulin 


289 


68 


420 


G01984


Homo sapiens 


Human secreted protein, SEQ ID NO: 6065. 


209 


74 


421 


AL109827


Homo sapiens 


dJ309K20.2 (acrosomal protein ACR55 (similar 
to rat sperm antigen 4 (SPAG4))) 


1446 


96 


422 


AC008075 


Arabidopsis
thaliana 


F24J5.4 


112 


35 









AF231705


Homo sapiens


Alu co-repressor 1


1090


100


424


AF234887


Homo sapiens




6268 


97 


425


Y35942


Homo sapiens


Extended human secreted protein sequence, SEQ
ID NO 191


1961


99






426


AB009288


Homo sapiens


1 ^ vUUIIlv 


635 


98 


427


L12392


Homo sapiens


HiintiriP+ori^ nrot-ein 


16080 

A W VV 


99 


428


Y94990


Homo sapiens


Human secreted protein vb21_1, SEQ ID NO:20.


768 


98 


429


AJ9Q3S73 


Homo sapiens


yinr finoPT nrntPin f^C7Jinne 


542 


87 


430


Y84441


Homo sapiens


Amino acid sequence of a human RNA-
associated protein.


2074


100






431 


G02850 


Homo sapiens


Human secreted protein, SEQ ID NO: 6931.


723 


95 


432 


G04067 


Homo sapiens


Human secreted protein, SEQ ID NO: 8148.


73


42 


433 


AF159296


Lycopersicon
esculentum


extensin-like protein


613


48








434 


W48351 


Homo sapiens


Human breast cancer related protein BCRB2.


135 


44 


435 


X73874 


Homo sapiens


phosphorylase kinase


3442 


97 


436 


AF161426


Homo sapiens


HSPC308 


268 


74 


437 


Y30812 


Homo sapiens


Human secreted protein encoded from gene 2.


1055 


52 


438 


G03798 


Homo sapiens


Human secreted protein, SEQ ID NO: 7879.


168 


56 


439 


X14766


Homo sapiens


GABA-A receptor alpha 1 subunit


2294 


96 


440 


X02344 


Homo sapiens


beta-tubulin


311 


95 


441 


AF168418 


Homo sapiens


activating signal cointegrator 1


1882 


100 


442


I 1 1 £79 
Jul lO/z 


Homo sapiens


zinc finger protein


79 S 


S4 


443


nn^9m 


Homo sapiens


Uiinifln Qprr*»t#»H nmtctn ^IFO Tn TslO* 79 R4 
riuiiian dC^rcicu pruicui^ ojz>\^ ils /^oi. 




96 


444


AS9140 

AJiltU 


UlilUClitlllCU 


HI TMAN "MF>T? 


24S1 


1 UU 






Homo sapiens


rydjjouinc rcucp lui i. 






446


ApI 1 A719 
rVT 1 I O / 1Z 


Homo sapiens


r ivuz, /jo 


297 


40 


447


AF245447


Homo sapiens


sphingosine kinase type 2 isoform


576 


Q9 


448


API 33ftfiA 


Homo sapiens


membrane-type serine protease 1




Q4 


449


U87305


Rattus


transmembrane receptor UNC5H1


Rl 7 
ol / 


Q1 






norvegicus








450




Homo sapiens


j/\ w i-reiaiea protein iviivvi i/\ long laoiorm 


4JO& 


yy 


451




Homo sapiens


IV J 1 DO J 1 


3 

j 1 0 


69 


452




Homo sapiens


granule membrane protein-140


AC A 
4D4 


7"* 


453




Homo sapiens 


intelectin 




Sift 
00 


454




Homo sapiens 


Human secreted protein, SEQ ID NO: 4999.


9^ 


fil 
0 1 


455 


Y22W4 


Homo sapiens 


Human cytokine inducible regulatory protein-I 


1 O'l 

Iy2 


an 

Of 














456


J jD/VJ 


Homo sapiens


Fragment of human secreted protein encoded by




Aft 
4v 








gene o^. 






457 


Ny 1,525 


Homo sapiens 


DNA encoding human growth hormone receptor. 




yc> 


458 


Ml ? 155 


Plasmodium 


S-antigen precursor


t 1 0 
1 1U 


Jo 






falciparum








459


vi unn 


Homo sapiens 


Ammo acia sequence oi protein rKUz j /. 




"no * 


460


Y02693


Homo sapiens


Human secreted protein encoded by gene 44


1 AQ 

i4y 


4j 








clone HTDAD22.






461


VI AAQn 


Homo sapiens 


Fragment of human secreted protein encoded by 


1 CA 
1 o4 


S4 
D4 








gene 1 / . 






462


I JJUUJ 


Homo sapiens


riujnajj secreicu proiein cionc pin /4y_o pruiciii 




47 

4 / 








sequence iu inw.io. 






463


YR40AA 


Triticum


low molecular weight glutenin


1 HQ 


33 






aestivum








464


OOl O 
W lyyiy 


Homo sapiens


Human KSR-1 (kinase suppressor of Ras).


1 /oi 


AS 
OJ 




API 80*7/^ 
/IT 1 oy /Oh 


Mus


aipHarDSla ItyQlOiMSS-l 




SO 






musculus








466 


U93569 


Homo sapiens




101 


30 


467 


Y41528 


Homo sapiens 


Fragment of human secreted protein encoded by
gene 77.


1172


99







468 


G02872 


Homo sapiens 


Human secreted protein, SEQ ID NO: 6953. 


149 


52 


469 


AJ000008 


Homo sapiens 


PI3-kinase 


5832 


97 


470 


X70922 


Mus
musculus


neurotoxin homologue


118


47








471 


G03797 


Homo sapiens 


Human secreted protein, SEQ ID NO: 7878. 


198 


75 


472 


Y36705 


Homo sapiens 


Fragment of human secreted protein encoded by
gene 62.


72


57






473 


G02313 


Homo sapiens 


Human secreted protein, SEQ ID NO: 6394. 


328 


100 


474 


Y07007 


Homo sapiens 


Breast cancer associated antigen precursor 
sequence. 


1013 


97 


475 


W93254 


Homo sapiens 


Human ESRP1 protein. 


943 


80 


476 


W48351 


Homo sapiens 


Human breast cancer related protein BCRB2. 


236 


65 


477 


Y02693 


Homo sapiens 


Human secreted protein encoded by gene 44 
clone HTDAD22. 


202 


60 


478 


G01870


Homo sapiens 


Human secreted protein, SEQ ID NO: 5951.


267 


100 


479 


AF102777 


Mus 

musculus 


FYVE finger-containing phosphoinositide kinase 


3427


92


480 


G03052 


Homo sapiens 


Human secreted protein, SEQ ID NO: 7133. 


123


53


481 


W87701


Homo sapiens 


A human membrane fusion protein designated 
SYTAX1. 


221 


77 


482 


G03119 


Homo sapiens 


Human secreted protein, SEQ ID NO: 7200. 


131 


39 


483 


AF210651 


Homo sapiens 


NAG18


124


59


484 


AF010144


Homo sapiens 


neuronal thread protein AD7c-NTP 


343


50


485 


G00637 


Homo sapiens 


Human secreted protein, SEQ ID NO: 4718.


129


70


486 


U15174 


Homo sapiens 


BCL2/adenovirus E1B 19kD-interacting protein
3 


149 


73 


487 


Y76167 


Homo sapiens 


Human secreted protein encoded by gene 44. 


627 


100 


488 


AJ275213 


Homo sapiens 


stabilin-1


1244 


91 


489 


G03798 


Homo sapiens 


Human secreted protein, SEQ ID NO: 7879. 


313 


65 


490 


L12392 


Homo sapiens 


Huntington's Disease protein 


16081 


100 


491 


G03789 


Homo sapiens 


Human secreted protein, SEQ ID NO: 7870. 


197 


66 


492


J03799 


Homo sapiens 


laminin-binding protein 


275t 


70 


493


U15174 


Homo sapiens 


BCL2/adenovirus E1B 19kD-interacting protein
3 


128 


41 


494 


Y02693 


Homo sapiens 


Human secreted protein encoded by gene 44 
clone HTDAD22. 


197 


67 


495 


AC005175 


Homo sapiens 


R31449_3 


889 


94 


496 


G03786 


Homo sapiens 


Human secreted protein, SEQ ID NO: 7867. 


229 


61 


497 


AB030237 


Canis 
familiaris


D4 dopamine receptor 


90 


48 


498 


G02872 


Homo sapiens 


Human secreted protein, SEQ ID NO: 6953. 


228 


65 


499 


U70935 


Peromyscus 
maniculatus 


reverse transcriptase 


213 


52 


500 


U48508 


Homo sapiens 


skeletal muscle ryanodine receptor 


26406 


99 


501 


G03371 


Homo sapiens 


Human secreted protein, SEQ ID NO: 7452. 


105 


58 


502 


AF119851 


Homo sapiens 


PRO1722


156 


62 


503 


AF113685


Homo sapiens 


PRO0974 


116 


50 


504 


U79458 


Homo sapiens 


WW domain binding protein-2 


322 


59 


505 


W29651 


Homo sapiens 


Human secreted protein CD124 3. 


608 


55 


506 


W85459 


Homo sapiens 


Secreted protein encoded by clone dbl 135_9. 


986 


70 


507 


Y86265 


Homo sapiens 


Human secreted protein HUSXE77, SEQ ID
NO:180. 


115 


33 


508 


AL160175 


Homo sapiens 


bA243J16.3 (similar to MYLK (myosin, light 
polypeptide kinase)) 


184 


92 


509 


U43360 


Peromyscus 
maniculatus 


reverse transcriptase 


97 


62 


510 


G03789 


Homo sapiens 


Human secreted protein, SEQ ID NO: 7870. 


117 


63 


511 


W79092 


Homo sapiens 


Human secreted protein dn740_3. 


1058 


100 


512 


AF010144


Homo sapiens 


neuronal thread protein AD7c-NTP 


205 


64 


513 


AJ133439


Homo sapiens 


GRIP1 protein 


2151 


100 


514 


AE003456 


Drosophila 
melanogaster 


CG6393 gene product 


259 


42


515 


Z17206


Xenopus 
laevis 


p46XlEg22 


128 


40 


516 


AF104413 


Homo sapiens 


large tumor suppressor 1 


1766 


94 


517 


G03797 


Homo sapiens 


Human secreted protein, SEQ ID NO: 7878. 


92 


40 


518 


AF151083 


Homo sapiens 


HSPC249 


444 


98 


519 


S80864 


Homo sapiens 


cytochrome c-like polypeptide 


318 


50 


520 


X92485 


Plasmodium 
vivax 


pval 


170 


61 






SEQ
ID
NO:


Accession
No.


Species


Description


Smith-
Waterman
Score


%
Identity


521


G03790


Homo sapiens


Human secreted protein, SEQ ID NO: 7871.


159


59


522


AF121857


Homo sapiens


sorting nexin 7


259


40


523


G02654


Homo sapiens


Human secreted protein, SEQ ID NO: 6735.


82


37


524


W88627


Homo sapiens


Secreted protein encoded by gene 94 clone
HPMBQ32.


253


73


525


AF119851


Homo sapiens


PRO1722


162


57


526


Y27761


Homo sapiens


Human secreted protein encoded by gene No. 47.


154


57


527


G02707


Homo sapiens


Human secreted protein, SEQ ID NO: 6788.


70


45


528


U47924


Homo sapiens


C8


1112


86


529


G04063


Homo sapiens


Human secreted protein, SEQ ID NO: 8144.


84


45


530


G03203


Homo sapiens


Human secreted protein, SEQ ID NO: 7284.


111


60


531


G04067


Homo sapiens


Human secreted protein, SEQ ID NO: 8148.


92


65


532


G03267


Homo sapiens


Human secreted protein, SEQ ID NO: 7348.


75


29


533


G03203


Homo sapiens


Human secreted protein, SEQ ID NO: 7284.


182


48


534


AF068286


Homo sapiens


HDCMD38P


861


100



535 



536 



537 
538 



539 



540 



541 



542 



543 



544 



545 



546 



547 



548 



549 



550 



551 



552 



553 



554 



555 



U07707 



Homo sapiens 



G01955 



Homo sapiens 



epidermal growth factor receptor substrate 
Human secreted protein, SEQ ID NO: 6036. 



228 



484 



AF219232



Gallus gallus


qin-induced kinase



206 



AF135022 



Homo sapiens


mediator



128 



G03267 



Homo sapiens 



AF016430 



Caenorhabditis
elegans



Human secreted protein, SEQ ID NO: 7348. 
contains similarity to a BR-C/TTK domain 



141 



853 



AC003093 



Homo sapiens 



OXYSTEROL-BINDING PROTEIN; 45% 
similarity to P22059 (PID:g129308)



408 



M29487 



Homo sapiens


integrin alpha subunit precursor



517 



AF102530 



Mus 
musculus 



olfactory receptor F3 



327 



Y73431 



Homo sapiens


Human secreted protein clone ybl86_l protein



386 



AE004833 



Pseudomonas 
aeruginosa 



sequence SEQ ID NO:84. 

probable TonB-dependent receptor 



279 



G03793 



Homo sapiens


Human secreted protein, SEQ ID NO: 7874.



264 



Y69192 



Homo sapiens 



A human monocyte-macrophage apolipoprotein
B receptor protein. 



1772 



Y91493 



Homo sapiens 



Human secreted protein sequence encoded by 
gene 43 SEQ ID NO: 166. 



176 



G01571 



Homo sapiens 



AF044588 



Y29332 



X98330 



Y42782 



AB025258 



AJ010346 



Homo sapiens 



Human secreted protein, SEQ ID NO: 5652. 
protein regulating cytokinesis 1; PRC1 



777 



1953 



Homo sapiens 



Human secreted protein clone pe584_2 protein 
sequence. 



1224 



Homo sapiens


ryanodine receptor 2



24621 



Homo sapiens


Human UC Band#331 protein.



684 



Mus 

musculus 



granuphilin-a 



501 



Homo sapiens


RING-H2



1468 
538 



60 



75 



53 
100 



59 



39 



66 



73 



100 



42 



53 



67 



100 



99 



88 



94 



99 



95 



41 



100 
92 



556 



557 



558 



559 



560 



561 



562 



563 



564 



565 



566 



W92388 



Homo sapiens


Human TR-interacting protein S239a.



AF119851



Homo sapiens


PRO1722



175 



AF117756



Homo sapiens 



thyroid hormone receptor-associated protein 
complex component TRAP150 



183 



G02872 



Homo sapiens


Human secreted protein, SEQ ID NO: 6953.



319 



D86214 



Mus 

musculus 



Ca2+ dependent activator protein for secretion 



1010 



AF187325



Canis 
familiaris 



melanoma antigen 



AJ001981 



Homo sapiens


OXA1L



Z17238 



Rattus 
norvegicus 



glutamate receptor subtype delta-1



W30638 



Homo sapiens 



AC005620 



Homo sapiens 



Partial human 7-transmembrane receptor 

HAPOl 67 protein. 

R33590 



1 



Y99358 



Homo sapiens 



Human PRO1772 (UNQ834) amino acid
sequence SEQ ID NO:63. 



287 



2512 



338 



371 



467 



1138 
~R)02 



59 



32 



68 



93 



55 



99 



66 



100 



97 



78 



567 



568 



AL031177 



Homo sapiens


dJ889M15.3 (novel protein)



AF151043 



Homo sapiens


HSPC209



798 



58 



100 






SEQ 

ID 

NO: 


Accession 
No. 


Species 


Description 


Smith- 
Waterman 
Score 


% 

Identity 


569 


AF097518 


Homo sapiens 


liver-specific transporter 


231 


100 


570 


AB035698 


Homo sapiens 


Misshapen/NIK-related kinase MINK-1 


1532 


100 


571 


Y07096 


Homo sapiens 


Colon cancer associated antigen precursor 
sequence. 


1064 


100


572 


AL031177 


Homo sapiens 


dJ889M15.3 (novel protein)


735 


55 


573 


Y66639 


Homo sapiens 


Membrane-bound protein PRO290, 


254 


45 


574 


AB037108 


Homo sapiens 


seven transmembrane domain orphan receptor 


1883 


99 


575 


D43949 


Homo sapiens 


This gene is novel. 


836 


100 


576 


Y48596 


Homo sapiens 


Human breast tumour-associated protein 57. 


108 


50 


577 


G00352 


Homo sapiens 


Human secreted protein, SEQ ID NO: 4433. 


141 


75 


578 


R95913 


Homo sapiens 


Neural thread protein. 


140 


65 


579 


AK025116


Homo sapiens 


unnamed protein product 


201 


70 


580 


Y86473 


Homo sapiens 


Human gene 52-encoded protein fragment, SEQ 
ID NO:388.


77 


70 


581 


AF196779


Homo sapiens 


JM10 protein


450 


100 


582 


AF188706


Homo sapiens 


g20 protein 


330 


98 


583 


AB030234 


Canis 
familiaris 


D4 dopamine receptor 


64 


56 


584 


G02621 


Homo sapiens 


Human secreted protein, SEQ ID NO: 6702.


345 


90 


585 


AL096828 


Homo sapiens 


dJ963E22.1 (Novel protein similar to NY-REN-2
Antigen) 


268 


85 


586 


Y30819 


Homo sapiens 


Human secreted protein encoded from gene 9. 


235 


35 


587 


G00357 


Homo sapiens 


Human secreted protein, SEQ ID NO: 4438. 


132 


56 


588 


G02872 


Homo sapiens 


Human secreted protein, SEQ ID NO: 6953. 


182 


79 


589 


AF235017 


Mus 

musculus 


2P1 protein 


764 


80 


590 


W88627 


Homo sapiens 


Secreted protein encoded by gene 94 clone 
HPMBQ32. 


329 


81 


591 


Y30709 


Homo sapiens 


Amino acid sequence of a human secreted 
protein. 


110 


43 


592 


Y53875 


Homo sapiens 


A human seven transmembrane signal transducer 
polypeptide. 


1369 


92 


593 


Y53051 


Homo sapiens 


Human secreted protein clone ddl I9_4 protein 
sequence SEQ ID NO: 108.


1112 


97 


594 


Y27658 


Homo sapiens 


Human secreted protein encoded by gene No. 92. 


763 


79 


595 


G03798 


Homo sapiens 


Human secreted protein, SEQ ID NO: 7879. 


156 


58 


596 


AF151110


Mus 

musculus 


COP1 protein 


2215 


95 


597 


G03786 


Homo sapiens 


Human secreted protein, SEQ ID NO: 7867. 


157 


65 


598


AF192499 


Mus 

musculus 


putative secreted protein ZSIG37 


143 


40 


599 


AF119855


Homo sapiens 


PRO1847


236 


76 


600


G02872 


Homo sapiens 


Human secreted protein, SEQ ID NO: 6953. 


212 


73 


601 


Y00295 


Homo sapiens 


Human secreted protein encoded by gene 38. 


567 


88 


602 


AF184971 


Homo sapiens 


class II cytokine receptor ZCYTOR7 


2015 


74 


603 


AF061936 


Homo sapiens 


diacylglycerol kinase iota 


773 


96 


604 


AL096828 


Homo sapiens 


dJ963E22.1 (Novel protein similar to NY-REN-2 
Antigen) 


1333 


93 


605 


AB033106 


Homo sapiens 


KIAA1280 protein 


3915 


100 


606 


X75756 


Homo sapiens 


protein kinase C mu 


3916 


99 


607 


D86983 


Homo sapiens 


similar to D.melanogaster peroxidasin(U11052)


5758 


99 


608 


W69341 


Homo sapiens 


Secreted protein of clone CG279J . 


1377 


99 


609 


W88627 


Homo sapiens 


Secreted protein encoded by gene 94 clone 
HPMBQ32. 


339 


82 


610 


Y27868 


Homo sapiens 


Human secreted protein encoded by gene No. 
107. 


116 


62 


611 


AF202636 


Homo sapiens 


angiopoietin-like protein PP1158


2164 


100 


612 


AF090944 


Homo sapiens 


PRO0663 


218 


82 


613 


Y02693 


Homo sapiens 


Human secreted protein encoded by gene 44 
clone HTDAD22. 


195 


59 


614 


M87053 


Rattus 
norvegicus 


lens membrane protein 


450 


84 


615 


AC004232 


Homo sapiens 


FPM315 


163 


37 


616 


G01984 


Homo sapiens 


Human secreted protein, SEQ ID NO: 6065.


205 


79 






SEQ
ID
NO:


Accession
No.


Species 


Description 


Smith- 
Waterman 
Score 


% 

Identity 


617 


Y91524 


Homo sapiens


Human secreted protein sequence encoded by
gene 74 SEQ ID NO: 197. 


821 


99 


618 


AJ245621 


Homo sapiens


CTL2 protein


2258 


99 


619 


Y76198


Homo sapiens 


Human secreted protein encoded by gene 75. 


108 


64 


620 


AF067864 


Homo sapiens 


transferrin receptor 2 alpha 


3922 


94 


621 


D90721 


Escherichia 


Transmembrane protein dppC 


573 


90 




W7SRSR 


Homo sapiens


Human <iprrptrvrv nrntpin nf clftne OS7^7--l 


730 


100 




VQ4QR7 


Homo sapiens


Human secreted protein vhl2 1 SEQ ID NO:4


733 


100 


624


AF0^474S 






637 


83 


625 


U42580 


Paramecium
bursaria
Chlorella
virus 1


Pro-rich, IPPPNMSLPLS (3x) 


94 


46 


626 


U79260 


Homo sapiens


unknown 


194 


70 


627 


R95913 


Homo sapiens 


Neural thread protein. 


99 


50 


628 


G03450 


Homo sapiens 


Human secreted protein, SEQ ID NO: 7531. 


427 


100 


629


I jUiO 1 




Human secreted protein encoded by gene


590 


100 


630 


Y02693 


Homo sapiens 


Human secreted protein encoded by gene 44 
clone HTDAD22.


165 


76 








Human secreted protein, SEQ ID NO: 6770.


768 


96 






Homo sapiens




351 


80 


633 


AF121857 


Homo sapiens 


sorting nexin 7 


2019 


100 


634


Ar ^0 J/72 


Homo sapiens 


similar to Homo sapiens ribosomal protein L10
L25899 




77 




1 u /uyu 


Homo sapiens


Renal cancer associated antigen precursor
sequence. 


777 


64 






Homo sapiens


UUoro 


414 


76 


637 


G02872 


Homo sapiens 


Human secreted protein, SEQ ID NO: 6953. 


315 


71 






Rattus
norvegicus 


o/vt>/\ transporter 


074 




639 


G03789 


Homo sapiens 


Human secreted protein, SEQ ID NO: 7870. 


219 


60 




1 U14UU 


Homo sapiens


occrcicu pruiem cncouca dj gene i o oiouc 


1^7 


7Q 






Arabidopsis
thaliana




171 






W74K74 


Homo sapiens


Human secreted protein encoded by gene 96
clone HAOBK61. 


615 


62 


643 


AB015982 


Homo sapiens 


serine/threonine kinase 


485 


98 


644 


Y25806 


Homo sapiens 


Human secreted protein fragment encoded from 
gene 23


162 


46 


645 


AF122904 


Homo sapiens 


membrane protein DAP10


474 


100 


646 


AF233323 


Homo sapiens 


Fas-associated phosphatase-1


200 


38 


647 


W48804 


Homo sapiens 


Homo sapiens clone RK158 t protein


1203 


99 


648 


AF257330 


Homo sapiens 


COBW-like protein 


1440 


98 


649 


Y36203 


Homo sapiens 


Human secreted protein


233 


73 


650 


G02872


Homo sapiens


Human secreted protein, SEQ ID NO: 6953. 


173 


78 


651


Y32199


Homo sapiens


llMllUut TEVCpiQl luUiE\A2lG CHUAJIHJ Vj 

Inrvte rlnne 7077^7Q 


1017 


100 


652


AB032909 


Hylobates 
agilis


dopamine receptor D4


122 


32 




AK021848 


Homo sapiens 


unnamed protein product


186 


69 




W73411 


Homo sapiens 


Human secreted protein encoded by gene No.
15.




37 


655 


L22455 


Rattus 
norvegicus 


mu opioid receptor


116 


34 


656 


G03112 


Homo sapiens 


Human secreted protein, SEQ ID NO: 7193. 


110 


45 


657 


G02345 


Homo sapiens 


Human secreted protein, SEQ ID NO: 6426. 


459 


97 


658 


W88627 


Homo sapiens 


Secreted protein encoded by gene 94 clone 
HPMBQ32. 


291 


75 


659 


G02832 


Homo sapiens 


Human secreted protein, SEQ ID NO: 6913.


134 


65 


660 


Y91423 


Homo sapiens 


Human secreted protein sequence encoded by 
gene 11 SEQ ID NO: 144. 


333 


96 



SEQ

ID 

NO: 


Accession 
No. 


Species 


Description 


Smith- 
Waterman 
Score 


% 

Identity 


661 


G03789 


Homo sapiens 


Human secreted protein, SEQ ID NO: 7870.


168 


68 


662 


Y53886 


Homo sapiens 


A suppressor of cytokine signalling protein 
designated HSCOP-6. 


375 


43 


663 


W75771 


Homo sapiens 


Human GTP binding protein APD08. 


629 


100 


664 


AL096770 


Homo sapiens 


DA150A6.2 (novel 7 transmembrane receptor 
(rhodopsin family) (olfactory receptor like) 
protein (hs6M1-21))


480 


55 


665 


AB037734 


Homo sapiens 


KIAA1313 protein


978 


96 


666 


W82841 


Homo sapiens 


Human cerebral protein-1.


192 


84 


667 


W82841 


Homo sapiens 


Human cerebral protein-1.


182 


87 


668 


AB030184 


Mus 

musculus 


contains transmembrane (TM) region and ATP 
binding region 


757 


68 


669 


AB032919 


Hylobates 
muelleri 


dopamine receptor D4


85 


37 


670 


AF107295 


Rattus 
norvegicus 


outer membrane protein 


746 


81 


671 


Z33642 


Homo sapiens 


leukocyte surface protein 


394 


93 


672 


W85608 


Homo sapiens 


Secreted protein clone du410_5. 


261 


91 


673 


G03203 


Homo sapiens 


Human secreted protein, SEQ ID NO: 7284. 


106 


48 


674 


AL035587 


Homo sapiens 


dJ475N16.4 (KIAA0240)


2388 


99 


675 


Y59668 


Homo sapiens 


Secreted protein 108-005-5-0-Cl-FL. 


1134 


53 


676 


G03797 


Homo sapiens 


Human secreted protein, SEQ ID NO: 7878. 


174 


74 


677 


AF026954 


Bos taurus 


pyruvate dehydrogenase phosphatase regulatory 
subunit precursor; PDPr 


1013 


95 


678 


L11625


Mus 

musculus 


receptor protein-tyrosine kinase 


545 


96 


679 


AL031427 


Homo sapiens 


dJ167A19.3 (novel protein)


745 


100 


680 


AJ133430


Mus 

musculus 


olfactory receptor 


528 


77 


681 


G02532 


Homo sapiens 


Human secreted protein, SEQ ID NO: 6613.


179 


70 


682 


G03789 


Homo sapiens 


Human secreted protein, SEQ ID NO: 7870. 


336 


76 


683 


Y94943 


Homo sapiens 


Human secreted protein clone ytl 4__1 protein 
sequence SEQ ID NO:92.


118 


100 


684 


U43360 


Peromyscus 
maniculatus 


reverse transcriptase 


100 


37 


685 


G00885 


Homo sapiens 


Human secreted protein, SEQ ID NO: 4966. 


162 


60 


686 


AK001518 


Homo sapiens 


unnamed protein product 


590 


100 


687 


G01982 


Homo sapiens 


Human secreted protein, SEQ ID NO: 6063. 


718 


100 


688 


Y92241 


Homo sapiens 


Human cancer associated antigen precursor 
(MO-REN-46). 


2405 


99 


689 


AC024792 


Caenorhabditis
elegans


contains similarity to TR:P78316


423 


36 


690 


Y27868 


Homo sapiens 


Human secreted protein encoded by gene No. 
107. 


183 


81 


691 


Y56514 


Homo sapiens 


Human Jurkat cell clone P2-15 AIM10 longest 
ORF protein sequence. 


180 


88 


692 


Y27795 


Homo sapiens 


Human secreted protein encoded by gene No. 79. 


1539 


99 


693 


Y36268 


Homo sapiens 


Human secreted protein encoded by gene 45. 


428 


98 


694 


U12465


Homo sapiens 


ribosomal protein L35 


308 


89 


695 


Y45272 


Homo sapiens 


Human secreted protein encoded from gene 16.


1517 


99 


696 


AF191838 


Homo sapiens 


TANK binding kinase TBK1


1242 


98 


697 


Y02693 


Homo sapiens 


Human secreted protein encoded by gene 44 
clone HTDAD22.


275 


75 


698 


Y87280 


Homo sapiens 


Human signal peptide containing protein HSPP- 
57 SEQ ID NO:57.


576 


90 


699 


Y97999 


Homo sapiens 


Human SCAD family molecule HSFM-1, SEQ 
ID NO:1.


729 


99 


700 


AJ006701 


Homo sapiens 


putative serine/threonine protein kinase 


610 


79 


701 


AF209198 


Homo sapiens 


zinc finger protein 277 


2357 


100 


702 


AJ298841 


Mus 

musculus 


torsinA protein 


709 


45 


703 


AK021729 


Homo sapiens 


unnamed protein product 


622 


98 


704 


Z46787 


Caenorhabditis
elegans


similar to Glutaredoxin, Zinc finger, C3HC4 
type (RING finger) 


920 


51 


705 


G02882 


Homo sapiens 


Human secreted protein, SEQ ID NO: 6963. 


589 


98 






SEQ 
ID 

NO: 


Accession 
No. 


Species 


Description 


Smith- 

Waterman 

Score 


% 

Identity 


706 


G02501


Homo sapiens 


Human secreted protein, SEQ ID NO: 6582.


125


58


707 


R95326 


Homo sapiens 


Tumor necrosis factor receptor 1 death domain 
ligand (clone 2DD). 


121 


95 


708 


G03002 


Homo sapiens 


Human secreted protein, SEQ ID NO: 7083.


125 


39


709 


Y96202 


Homo sapiens 


IkappaB kinase (IKK) binding protein, Y2H56. 


516 


98


710 


M63577 


Saccharomyces
cerevisiae


SFP1 


131 


59 


711


AB026291


Rattus 
norvegicus 


acetoacetyl-CoA synthetase 


40/ 


85 


712 


D21211


Homo sapiens 


protein tyrosine phosphatase (PTP-BAS, type 3) 


368 


44 


713 


AF044033


Marmota 
marmota 


olfactory receptor 


015 


83 


714 


G03561 


Homo sapiens 


Human secreted protein, SEQ ID NO: 7642. 


251 


100 


715 


AB033062 


Homo sapiens 


KIAA1236 protein 


1380


100


716 


G00577 


Homo sapiens 


Human secreted protein, SEQ ID NO: 4658.


80


73 


717 


Y96864 


Homo sapiens 


SEQ. ID. 37 from WO0034474. 


835 


99 


718 


AJ243396 


Homo sapiens 


voltage-gated sodium channel beta-3 subunit 


234 


100 


719 


U47334 


Homo sapiens 


similar to chicken gamma aminobutyric acid 
receptor beta4 subunit 


578 


99 


720 


AB020598 


Homo sapiens 


peptide transporter 3 


1096 


100 


721


Y53886 


Homo sapiens 


A suppressor of cytokine signalling protein 
designated HSCOP-6. 


570 


74 


722 


J05046 


Homo sapiens 


insulin receptor-related receptor 


6787 


100 


723 


AF001958 


Ambystoma 
tigrinum 


electrogenic Na+ bicarbonate cotransporter, 
NBC 


111 


41 


724 


AF127084


Mus 

musculus 


semaphorin cytoplasmic domain-associated 
protein 3 A 


5253 


94


725 


X54673 


Homo sapiens 


GABA transporter 


3114 


99 


726 


AF016191 


Rattus 
norvegicus 


potassium channel 


370 


100 


727 


AB029559 


Rattus 
norvegicus 


BAT1 


139 


35 


728 


Y28503 


Homo sapiens 


HGFH3 Human Growth Factor Homologue 3. 


2186 


97 


729 


AJ011415 


Homo sapiens 


plexin-Bl/SEP receptor 


729 


56 


730 


Z93096 


Homo sapiens 


bK390B3.1 (manic fringe (Drosophila) 
homolog) 


142 


68 


731


Z10062 


Homo sapiens 


cDNA encoding a human vanilloid receptor 
homologue Vanilrep1.


675 


99 


732 


AF161382 


Homo sapiens 


HSPC264 


492 


94 


733 


AB029033 


Homo sapiens 


KIAA1110 protein


3826 


99 


734 


AE000493 


Escherichia 
coli 


putative transport protein 


592 


97 


735 


AL033379 


Homo sapiens 


dJ4 17022.2 (novel 7 transmembrane receptor 
(rhodopsin family) protein similar to high- 
affinity lysophosphatidic acid receptor homolog) 


2173 


99 


736 


AF132599


Homo sapiens


RANTES factor of late activated T lymphocytes-
1 


245 


56 


737 


X55019


Homo sapiens 


acetylcholine receptor delta subunit 


883


99


738 


X91906 


Homo sapiens 


voltage-gated chloride ion channel 


1978 


100 


739 


AB026116


Homo sapiens 


organic anion transporter 4


1444


98


740 


D00570 


Mus 

musculus 


open reading frame (196 AA) 


83 


24 


741 


W03626


Homo sapiens


Human thyrotropin GPR N-terminal sequence.


IIS 


40 


742 


U66059 


Homo sapiens 


V_segment translation product


614 


100 


743 


AF119815 


Homo sapiens 


G-protein-coupled receptor 


2751


99 


744 


X16663


Homo sapiens 


haematopoietic lineage cell protein (AA 1-486)


148 


93 


745 


W67838 


Homo sapiens 


Human secreted protein encoded by gene 32
clone HLTCJ63. 


448 


95


746 


W57260 


Homo sapiens 


Human semaphorin Y. 


2414 


100 


747 


W21578 


Homo sapiens 


Alzheimer's disease protein encoded by DNA 
from plasmid pGCS2232.


968 


65 


748 


Y94935 


Homo sapiens 


Human secreted protein clone yd218_l protein 
sequence SEQ ID NO:76. 


622 


100 


749 


AL022238 


Homo sapiens 


dJ1042K10.5 (novel protein) 


314 


85 


750 


G03889 


Homo sapiens 


Human secreted protein, SEQ ID NO: 7970. 


391 


87 






SEQ 

ID 

NO: 


Accession 
No.


Species 


Description 


Smith- 
Waterman 
Score 


% 

Identity 


751 


AB025258 


Mus 

musculus 


granuphilin-a 


773 


41 


752 


Y52386 


Homo sapiens 


Human transmembrane protein HP02000. 


900 


99 


753 


Y48586


Homo sapiens 


Human breast tumour-associated protein 47. 


2527 


99 


754 


AJ272207 


Homo sapiens 


putative G protein-coupled receptor 92 


694 


100 


755 


M85183 


Rattus 
norvegicus 


vasopressin receptor 


979 


68 


756 


AF190501 


Homo sapiens 


leucine-rich repeat-containing G protein-coupled 
receptor 6 


388 


71 


757 


Y02692 


Homo sapiens 


Human secreted protein encoded by gene 43 
clone HTADX17.


461 


87 


758 


Z22535 


Homo sapiens 


ALK-3 


439 


98 


759 


R04932 


Homo sapiens 


Interferon-gamma receptor segment from clone 
39 responsible for binding the target.


564 


97 


760 


W74902 


Homo sapiens 


Human secreted protein encoded by gene 175 
clone HE8BI9Z 


1217 


99 


761 


G03706 


Homo sapiens 


Human secreted protein, SEQ ID NO: 7787. 


223 


88 


762 


AB020676 


Homo sapiens 


KIAA0869 protein 


4433 


99 


763 


AK026992 


Homo sapiens 


unnamed protein product 


2285 


99 


764 


AF173358 


Homo sapiens 


glucocorticoid receptor AF-1 coactivator-1


573 


100 


765 


AF268066 


Mus 

musculus 


netrin 4 


2019 


89 


766 


Y48585 


Homo sapiens 


Human breast tumour-associated protein 46. 


1169 


89 


767 


AF230378 


Mus 

musculus 


interleukin-1 delta


309 



45 


768 


AF121975 


Mus 

musculus 


odorant receptor S18


268 


62 


769 


AB008515 


Homo sapiens 


RanBPM 


611 


57 


770 


Y09945 


Rattus 
norvegicus 


putative integral membrane transport protein 


458 


50 


771 


AF226731 


Homo sapiens 


AD026 


688 


99 


772 


Y27132 


Homo sapiens 


Human glioblastoma-derived polypeptide (clone 
OA004FG). 


1384 


100 


773 


X87832 


Homo sapiens 



NOV/plexin-A1 protein



1821 


98 


774 


AB025258 


Mus 

musculus 


granuphilin-a 


500 


41 


775 


AF125101 


Homo sapiens 


HSPC040 protein 


232 


93 


776 


G02815 


Homo sapiens 


Human secreted protein, SEQ ID NO: 6896. 


314 


95 


777 


G02493 


Homo sapiens 


Human secreted protein, SEQ ID NO: 6574.


191 


68 


778 


R03301 


Homo sapiens 


Sequence of pre-human atrial natriuretic peptide. 


213 


45 


779 


AL357374 


Homo sapiens 


bA353C18.2 (novel protein)


232 


100 


780 


AF100346 


Homo sapiens 


neuronal voltage gated calcium channel gamma- 
3 subunit 


1434 


89 


781 


Y19566 


Homo sapiens 


Amino acid sequence of a human secreted 
protein. 


103 


52 


782 


Y36233 


Homo sapiens 


Human secreted protein encoded by gene 10. 


1098 


93 


783 


AF084464 


Rattus 
norvegicus 


GTP-binding protein REM2 


141 


30 


784 


W49042 


Homo sapiens 


Human low density lipoprotein binding protein 
LBP-3. 


2693 


99 


785 


AF238381 


Homo sapiens 


PTOV1 


1904 


91 


786 


Y91870 


Homo sapiens 


Human apoptosis related protein. 


547 


100 


787 


Y71062 


Homo sapiens 


Human membrane transport protein, MTRP-7. 


1062 


94 


788 


AF117754


Homo sapiens 


thyroid hormone receptor-associated protein 
complex component TRAP240 


8684 


98 


789 


AL049569 


Homo sapiens 


dJ37C10.3 (novel ATPase) 


2848 


96 


790 


AF151848


Homo sapiens 


CGI-90 protein 


745 


96 


791 


Y08639 


Homo sapiens 


nuclear orphan receptor ROR-beta 


1421 


95 


792 


Y41706 


Homo sapiens 


Human PRO381 protein sequence.


644 


99 


793 


AF121228 


Homo sapiens 


thyroid hormone receptor-associated protein 
complex component TRAP95 


1037 


100 


794 


G04072 


Homo sapiens 


Human secreted protein, SEQ ID NO: 8153. 


124 


62 


795 


Y69384 


Homo sapiens 


Amino acid sequence of a 14274 receptor 
protein. 


119 


100 


796 


W40215 


Homo sapiens 


Human macrophage antigen. 


1358 


99 






Accession 
No. 



AF258340 



AF159615



Y59863 



Species 



Homo sapiens 



Homo sapiens 



Homo sapiens 



Description 



hepatocellular carcinoma-associated antigen 112 



FGF receptor activating protein 1 



Human normal uterus tissue derived protein 26. 



Smith- 
Waterman 
Score 



1151 



461 



797 
572 



% 



Identity 



99 



98 



99 
92 



W70459 



Homo sapiens 



Human T1-receptor ligand in splice variant 2.



L00073 



Homo sapiens 



renin 



1913 



P92219 



Homo sapiens 
(human) 



CR1 protein. 



11963 



X15357 



Homo sapiens 



ANP-A receptor preprotein (AA -32 to 1029) 



5199 
4018 



93 



97 



21. 
95 



W64473 



Homo sapiens 



Human secreted protein from clone EC172_1. 



AJ243874 



Homo sapiens 



oligophrenin-4 



2067 



G01731 



Homo sapiens 



Human secreted protein, SEQ ID NO: 5812.



284 



Z24680 



Homo sapiens 



garp 



1562 



AF171669 



Homo sapiens 



glycoprotein-associated amino acid transporter 
LAT2 



1364 



W70321 



Homo sapiens 



Secreted protein CC198J. 



W74843 



Homo sapiens 



AF108831



Homo sapiens 



AF092135 



Homo sapiens 



Human secreted protein encoded by gene 115

clone HOVBA03. 

K:Cl cotransporter 3



1154 



855 



4561 



PTD014 



862 



AF283772 



Homo sapiens 



similar to Homo sapiens ribosomal protein L10 
encoded by GenBank Accession Number 
L25899 



784 



G01563



Homo sapiens 



Human secreted protein, SEQ ID NO: 5644. 



330 



AF051151 



Homo sapiens 



Toll/interleukin-1 receptor-like protein 3 



W95630 



Homo sapiens 



Homo sapiens secreted protein gene clone 



gn!14_l 



G01082 



Homo sapiens 



Human secreted protein, SEQ ID NO: 5163. 



AF151800 



Homo sapiens 



L00352 



Homo sapiens 



CGI-41 protein 

low density lipoprotein receptor



X04434 



G03844 



AF212220 



Y50125 



AF156778 



AF096322 



Y07972 



Homo sapiens 



IGF-I receptor 



Homo sapiens 



Human secreted protein, SEQ ID NO: 7925. 



Homo sapiens 



TERA 



Homo sapiens 



Human glycophosphatidylinositol-anchored 
protein GPI-122. 



Homo sapiens 



ASB-3 protein 



Homo sapiens 



neuronal voltage-gated calcium channel gamma- 
2 subunit 



Homo sapiens 



AB032013 Homo sapiens 



Y13620



Homo sapiens 



Y91474 



Homo sapiens 



X54232 



Homo sapiens 



X14830 



Homo sapiens 



Human secreted protein fragment #2 encoded 
from gene 28. 



potassium channel Kv8.1



BCL9 



Human secreted protein sequence encoded by 
gene 24 SEQ ID NO:147.



glypican 



acetylcholine receptor beta-subunit preprotein 



3850 



358 



549 



1106 



3980 



5832 



572 



396 



4897 



2675 



1105 



1540 



2435 



5284 



541 



1625 



2540 



100 



100 



83 



90 



96 



99 



100 



100 



100 



100 



99 



100 



100 



95 



100 



99 



100 



48 

99 



98 



100 



100 



95 
96 



98 



87 



100 



Y71262 



Homo sapiens 



Human chondromodulin-like protein, Zchml 



1002 



G03873 



AC003030 



Y38422 



U41557 



AL121889 



Homo sapiens 



Human secreted protein, SEQ ID NO: 7954 



638 



Homo sapiens 



R29828_1



1389 



Homo sapiens 



Human secreted protein. 



964 



Caenorhabditis
elegans



glycine-rich 



85 



Homo sapiens 



dJ1076E17.1 (KIAA0823 protein (continues in
AL023803)) 



998 



AJ011415 



W80398 



G00862 



G02650 



AF036717 



Y73446 



G02872 



AF151810 



X83378 



AC004883 



Homo sapiens 



plexin-B1/SEP receptor



1580 



Homo sapiens 



A secreted protein encoded by clone cw1543_3.



1105 



Homo sapiens 



Human secreted protein, SEQ ID NO: 4943. 



255 



Homo sapiens 



Human secreted protein, SEQ ID NO: 6731 



644 



Homo sapiens 



FGFR signalling adaptor SNT-1 



2629 



Homo sapiens 



Human secreted protein clone yc27_1 protein
sequence SEQ ID NO:114.



1089 



Homo sapiens 
Homo sapiens 



Human secreted protein, SEQ ID NO: 6953. 



357 



CGI-52 protein 



1443 



Homo sapiens 



putative chloride channel 



1620 



Homo sapiens 



similar to general transcription factor 2I; similar



655 



100 



96 



93 



87 
36 



75 



60 



67 



92 



97 



99 



100 



69 



99 



96 






SEQ 
ID 

NO: 


Accession 
No. 


Species 


Description 


Smith- 
Waterman 
Score 


% 

Identity 








to AF038969 (PID:g2827207)






848 


X99886 


Homo sapiens 


monocyte chemotactic protein-2 


160 


76 


849 


AC005587 


Homo sapiens 


similar to mouse olfactory receptor 13; similar to 
P34984 (PID:g464305) 


963 


98 


850 


AB038237 


Homo sapiens 


G protein-coupled receptor C5L2 


1767 


100 


851


AF124490


Homo sapiens 


ARF GTPase-activating protein GIT1 


3415 


98 


852 


Y86217 


Homo sapiens 


Human secreted protein HWHGU54, SEQ ID 
NO:132.


1189 


99 


853 


AF224741 


Homo sapiens 


chloride channel protein 7 


3748 


99 


854 


X17094 


Homo sapiens 


furin (AA 1-794)


3550 


99 


855 


W78245 


Homo sapiens 


Fragment of human secreted protein encoded by 
gene 19. 


1245 


99 


856 


R97569 


Homo sapiens 


Interleukin-2 receptor associated protein p43.


1926 


100 


857 


Y41765 


Homo sapiens 


Human PRO1083 protein sequence.


3211 


99 


858 


AF057306 


Homo sapiens 


transmembrane proteolipid 


481 


84 


859 


AK025116 


Homo sapiens 


unnamed protein product 


374 


69 


860 


Y41312 


Homo sapiens 


Human secreted protein encoded by gene 5 clone 
HLDRM43. 


824 


100 


862 


Y25776 


Homo sapiens 


Human secreted protein encoded from gene 66. 


895 


99 


863 


Y74188 


Homo sapiens 


Human prostate tumor EST fragment derived 
protein #375. 


96 


30 


864 


AF167473


Homo sapiens 


heme-binding protein 


870 


99 


865 


G02532 


Homo sapiens 


Human secreted protein, SEQ ID NO: 6613. 


211 


67 


866 


X54870 


Homo sapiens 


Type II integral membrane protein


1201 


100 


867 


G00700 


Homo sapiens 


Human secreted protein, SEQ ID NO: 4781. 


640 


99 


868 


Y07894 


Homo sapiens 


Human secreted protein fragment encoded from 
gene 43. 


388 


88 


869 


J00123 


Homo sapiens 


preproenkephalin ( 


1349 


95 


870 


Y91632 


Homo sapiens 


Human secreted protein sequence encoded by 
gene 25 SEQ ID NO:305.


1048 


98 


871 


L04311


Homo sapiens 


GABA-alpha receptor beta-3 subunit 


237 


93 


872 


Y29988 


Homo sapiens 


Human cytokine family member EF-7 protein. 


960 


94 


873 


AF161382 


Homo sapiens 


HSPC264 


1124 


99 


874 


G03412 


Homo sapiens 


Human secreted protein, SEQ ID NO: 7493. 


464 


100 


875 


Y27572 


Homo sapiens 


Human secreted protein encoded by gene No. 6. 


573 


96 


876 


M15530


Homo sapiens 


B-cell growth factor


171 


56 


877 


W63681 


Homo sapiens


Human secreted protein 1.


1652 


99 


878 


L27867 


Rattus 
norvegicus 


neurexophilin 


1448 


98 


879 


Y10835 



Homo sapiens 


Amino acid sequence of a human secreted 
protein. 


321 


100 


880 


W88991 


Homo sapiens 


Polypeptide fragment encoded by gene 144. 


936 


100 


881 


AF118670


Homo sapiens 


orphan G protein-coupled receptor 


1971 


100 


882 


AF208865 


Homo sapiens 


EDRF 


528 


100 


883 


Y18462 


Homo sapiens 


cathepsin L 


209 


72 


884 


Y94950 


Homo sapiens 


Human secreted protein clone dh1073_12 protein
sequence SEQ ID NO: 106. 


348 


100 



885 


AF070661 


Homo sapiens 


HSPC005 


404 


100 


886 


Y04315 


Homo sapiens 


Human secreted protein encoded by gene 23. 


385 


100 


887 


X92744 


Homo sapiens 


hBD-1 


375 


100 


888 


Y22496 


Homo sapiens 


Human secreted protein sequence clone 
cn621_8. 


994 


94 


889 


Y41293 


Homo sapiens 


Human soluble protein ZTMPO-1.


4595 


99 


890 


G03714 


Homo sapiens 


Human secreted protein, SEQ ID NO: 7795. 


147 


63 


891 


AF208856 


Homo sapiens 


BM-014 


1012 


99 


892 


U29195 


Homo sapiens 


neuronal pentraxin II 


2002 


98 


893 


X68149 


Homo sapiens 


Burkitt lymphoma receptor 1 


1953 


100 


894 


Y94914 


Homo sapiens 


Human secreted protein clone pw337_6 protein
sequence SEQ ID NO:34. 


537 


100 


895 


W61630 


Homo sapiens 


Clone HNFGW06 of EGFR receptor family. 


326 


63 


896 


M24110 


Homo sapiens 


G0S19-2 peptide precursor 


481 


100 


897 


Z68747 


Homo sapiens 


imogen 38 


2018 


99 


898 


AF186112 


Homo sapiens 


neurokinin B-like protein ZNEUROK1


619 


100 


899 


AF225420 


Homo sapiens 


AD025


734


100 



SEQ

ID 

NO: 



900 



901 



902 



903 
904 



905 



906 



907 



908 



909 



910 



911 



912 



913 



914 
915 



916 



917 



918 



919 



920 



921 



922 
923 



924 



925 



926 



927 
928 



929 



930 



931 



932 



933 



934 



Accession 
No. 



P60657 



M27288 



W85737 



G01349 



Y00261 



AF039688 



AB007836 



AB017507



AK000056 



Y86299 



AF231023 



Y14134 



Species 



Description 



Homo sapiens | Sequence of human lipocortin.



Homo sapiens | oncostatin M 



Homo sapiens | Polypeptide with transmembrane domain 



Homo sapiens | Human secreted protein, SEQ ID NO: 5430.



Homo sapiens | Human secreted protein encoded by gene 4. 



Homo sapiens | antigen NY-CO-3



Homo sapiens | Hic-5 



Homo sapiens | Apg12



Homo sapiens | unnamed protein product 



Homo sapiens 



Human secreted protein HFOXB55, SEQ ID 
NO:214. 



Homo sapiens | protocadherin Flamingo 1



Z90420 



Y19757 



G03172 



U14971 



AF172854



AC005525 



AF166350 



Y87285 



Y36131 



AF193766
Y95013 



X75208 



Y96202 



AB039886 



G03368 



Y48606 



Y36151 



AF110399



AF210317 



Y73328 



Homo sapiens 



Vascular endothelial cell growth inhibitor beta 
protein sequence. 



Homo sapiens | Human GDF-3 (hGDF-3) polypeptide encoding
cDNA.



Homo sapiens | SEQ ID NO 475 from WO9922243.



Homo sapiens | Human secreted protein, SEQ ID NO: 7253. 



Homo sapiens | ribosomal protein S9 



Homo sapiens | cardiotrophin-like cytokine CLC



Homo sapiens | F22162_1



Homo sapiens | ST7 protein 



Homo sapiens 



Human signal peptide containing protein HSPP-62
SEQ ID NO:62.



Homo sapiens | Human secreted protein #3.



Homo sapiens | cytokine-like protein C17



Homo sapiens | Human secreted protein vc48j, SEQ ID NO:66.



Homo sapiens | protein tyrosine kinase-receptor 



Homo sapiens | IkappaB kinase (IKK) binding protein, Y2H56.



Homo sapiens | down-regulated in gastric cancer 



Homo sapiens | Human secreted protein, SEQ ID NO: 7449. 



Homo sapiens | Human breast tumour-associated protein 67.



Homo sapiens | Human secreted protein #23.



Homo sapiens | elongation factor Ts



Homo sapiens 



facilitative glucose transporter family member 
GLUT9 



G01959



U47924 



G03827 



Homo sapiens | HTRM clone 082843 protein sequence



Homo sapiens | Human secreted protein, SEQ ID NO: 6040. 



Homo sapiens | B-cell receptor associated protein 



Homo sapiens | Human secreted protein, SEQ ID NO: 7908. 



Smith- 
Waterman 
Score 



1835 



1297 



749 



650 



133 



771 



2544 



224 



1537 



427 



7393 



1319 



1950 



1361 



112 



886 



1204 



1963 



4711 



430 



465 



724 



357 



5256 



813 



785 



55 



539 



668 



1666 



2763 



931 
274 



1469 



529 
196 



% 



Identity 



100 



99 
100 



99 



99 
99 



100 



100 



98 



100 



99 



100 



100 



100 



48 



90 



99 



100 
99 



100 



88 



100 



100 



100 
98 



78 
50 



100 



100 
100 



99 



100 
100 



100 



93
63 



935 



936 



AB039371 



Homo sapiens | mitochondrial ABC transporter 3 



X56385 



Canis 
familiaris 



rab8 



1064 



100 



937 



938 
939 



940 



B08906 



Homo sapiens 



Human secreted protein sequence encoded by 
gene 16 SEQ ID NO:63.



117 



M13692 



Homo sapiens | alpha-1 acid glycoprotein precursor



1064 



Y53886 



Homo sapiens 



A suppressor of cytokine signalling protein 
designated HSCOP-6. 



515 



Y16630



Homo sapiens 



Human Putative Adrenomedullin Receptor 
(PAR). 



1904 



44 



99 



42 



99 
99



941 



942 



943 



944 



945 



946 



947 



948 



AC005102 



Homo sapiens 



small inducible cytokine subfamily A member 
24 



627 



M12886



Homo sapiens | T-cell receptor beta chain



1289 



AF226046 



Homo sapiens | GK003



1049 



Y36078 



Homo sapiens 



Extended human secreted protein sequence, SEQ 
ID NO. 463. 



667 



M22877 



Homo sapiens | cytochrome c



565 



W67869 



Homo sapiens | Human secreted protein encoded by gene 63 
clone HHGDB72. 



551 



W67859 



W85726 



Homo sapiens 



Human secreted protein encoded by gene 53 
clone HBMCL41.



283 



Homo sapiens | Novel protein (Clone BG33J7). 



789 
4236 



81 



98 



100 



100 



93 



100 



100 
100 



949 



950 



AJ242015 



Homo sapiens | eMDC II protein



G04075 



Homo sapiens | Human secreted protein, SEQ ID NO: 8156.



567 



99 






SEQ 
ID 

NO: 


Accession 
No. 


Species 


Description 


Smith- 
Waterman 
Score 


% 

Identity 


951 


AF110645


Homo sapiens 


candidate tumor suppressor p33 ING1 homolog 


1314 


100 


952 


Y36111


Homo sapiens 


Extended human secreted protein sequence, SEQ 
ID NO. 496. 


402 


70 


953 


AB012109 


Homo sapiens 


APC10 


990 


100 


954 


AF246221 


Homo sapiens 


transmembrane protein BR1 


1405 


100 


955 


AF054986 


Homo sapiens 


putative transmembrane GTPase 


1883 


100 


956 


W74726 


Homo sapiens 


Human secreted protein fg949_3. 


1879 


100 


957 


Y27096 


Homo sapiens 


Human viral receptor protein (ACVRP). 


1581 


100 


958 


AJ222967 


Homo sapiens 


cystinosin 


1920 


100 


959 


Y53052 


Homo sapiens 


Human secreted protein clone df2(X2_3 protein 
sequence SEQ ID NO: 110.


587 


100 


960 


G02694 


Homo sapiens 


Human secreted protein, SEQ ID NO: 6775. 


283 


100 


961 


AF151855 


Homo sapiens 


CGI-97 protein 


1214 


96 


962 


U26592 


Homo sapiens 


diabetes mellitus type I autoantigen 


250 


65 


963 


AL050306 


Homo sapiens 


dJ475B7.2 (novel protein) 


3796 


100 


964 


AF078859 


Homo sapiens 


PTD004


2089 


100 


965 


AB020315 


Homo sapiens 


homologue of mouse dkk-1 gene:Acc# 
AF030433 


1466 


100 


966 


X04571 


Homo sapiens 


precursor polypeptide (AA -22 to 1185)


6580 


99 


967 


AF146019 


Homo sapiens 


hepatocellular carcinoma antigen gene 520 


993 


99 


968 


AF071002 


Homo sapiens 


minK-related peptide 1; MiRP1


632 


100 


969 


AB021227 


Homo sapiens 


membrane-type-5 matrix metalloproteinase 


3545 


100 


970 


AF180920 


Homo sapiens 


cyclin L ania-6a 


1579 


100 


971 


AF105365 


Homo sapiens 


K-Cl cotransporter KCC4 


5621 


99 


972 


AF083248 


Homo sapiens 


ribosomal protein L26 homolog


739 


100 


973 


AJ132429 


Homo sapiens 


hyperpolarization-activated cyclic nucleotide 
gated cation channel hHCN4 


6295 


100 


974 


W61619 


Homo sapiens 


Clone HTPEF86 of TM4SF superfamily. 


454 


100 


975 


AF155100 


Homo sapiens 


zinc finger protein NY-REN-21 antigen 


2261 


100 


976 


AF275948 


Homo sapiens 


ABCA1 


11763 


99 


977 


AB026891 


Homo sapiens 


cystine/glutamate transporter 


2552 


100 


978 


AF117657


Homo sapiens 


thyroid hormone receptor-associated protein 
complex component TRAP80 


3348 


99 


979 


AF044201 


Rattus 
norvegicus 


neural membrane protein 35; NMP35 


1570 


92 


980 


AF119297


Homo sapiens 


neuroendocrine-specific protein-like protein 1


1170 


99 


981 


AF155652 


Homo sapiens 


potassium channel modulatory factor 


1983 


99 


982 


W88499 


Homo sapiens 


Human stomach carcinoma clone HP10412-encoded
protein.


1553 


99 


983 


Z56281 


Homo sapiens 


interferon regulatory factor 3 


2012 


98 


984 


AB026125 


Homo sapiens 


ART-4 


2160 


100 


985 


Y14482 


Homo sapiens 


Fragment of human secreted protein encoded by 
gene 17. 


172 


70 


986 


AB023888 


Homo sapiens 


b-chemokine receptor CCR4 


1895 


100 


987 


W27291 


Homo sapiens 


Human H1075-1 secreted protein 5' end.


712 


100 


988 


AF153450 


Manduca
sexta


juvenile hormone esterase binding protein 


226 


32 


989 


G03697 


Homo sapiens 


Human secreted protein, SEQ ID NO: 7778. 


194 


88 


990 


AF204159


Homo sapiens 


potassium large conductance calcium-activated 
channel beta 3a subunit 


1486 


100 


991 


G02061 


Homo sapiens 


Human secreted protein, SEQ ID NO: 6142. 


558 


99 


992 


AL031266 


Caenorhabditis
elegans


VM106R.1 


327 


40 


993 


Y66749 


Homo sapiens 


Membrane-bound protein PRO1124.


4730 


99 


994 


G01246 


Homo sapiens 


Human secreted protein, SEQ ID NO: 5327. 


141 


77 


995 


AF133845 


Homo sapiens 


corin 


5811 


99 


996 


AF117756


Homo sapiens 


thyroid hormone receptor-associated protein 
complex component TRAP150


4999 


100 


997 


W62066 


Homo sapiens 


Human stem cell antigen 2. 


284 


93 


998 


Y87173 


Homo sapiens 


Human secreted protein sequence SEQ ID 
NO:212. 


725 


100 


999 


Y13379 


Homo sapiens 


Amino acid sequence of protein PRO263.


1654 


99 


1000 


Y95008 


Homo sapiens 


Human secreted protein vftj, SEQ ID NO: 56. 


676 


47 


1001 


AF190167 


Homo sapiens 


membrane associated protein SLP-2 


1747 


100 






SEQ 

ID 

NO: 


Accession 
No. 


Species 


Description 


Smith-
Waterman
Score


%

Identity 


1002 


G01234


Homo sapiens 


Human secreted protein, SEQ ID NO: 5315. 


398 


96 


1003 



W73420 


Homo sapiens 


Human secreted protein encoded by gene No.
24.




ion 


1004 


A12791


Homo sapiens 


1 Q1j.T*\ CDD nrntam t A A 1 1 /Ll\ 

1 yi<JJ oKr-protein \j\J\ \ - i ^ ) 




1 AA 
1 VA' 


1 AA<r 

1005 


M23323 


Homo sapiens 


membrane protein 




1 AA 


1006 


X63743 


Homo sapiens 


KiJnL recepior 






1 AA"* ' 

1007 


Y35997 


Homo sapiens 


Extended human secreted protein sequence, SEQ
ID NO. 382.






1008 


AB032918 


Hylobates
moloch


dopamine receptor D4 


92 


35 


1009


Yylbov 


Homo sapiens 


Human secreted protein sequence encoded by
gene oi oIjv a* i^vjjj. 




oo 
sy 


1010


Art t< 


Homo sapiens


ujjutDit.i ^nuvci pruicui^ 






1011


G03733


Homo sapiens


Human secreted protein, SEQ ID NO: 7814.


379 


Qa 




Y I fz>J I 


Homo sapiens


numan secreieo proiciii cionc dl^uj ih piutcni. 


c i e 

OlU 


07 


1013 


G00724


Homo sapiens 


Human secreted protein, SEQ ID NO: 4805.




1 AA 

I W 


1014




Naegleria
gruberi 


haem lyase


114


j / 


1015 




Homo sapiens 


Mo3 proiein 


JOO / 


yy 


1016 


X15940


Homo sapiens


ribosomal protein L31 (AA 1-125)




l AA 


1017 


Y94873


Homo sapiens 


Human protein cione nruzo^z. 


1 fn& 


inn 



1018 


AL024498 


Homo sapiens 


dJ417M14.1 (novel protein)




1 Aft 


1019 


X83425 


Homo sapiens 


Lutheran blood group glycoprotein 


3054 


99 


1020 


W03516 


Homo sapiens 


Prostaglandin DP receptor. 




100


1021 


G03960 


Homo sapiens 


Human secreted protein, SEQ ID NO: 8041.


3"o 


100


1022 


Y91689 


Homo sapiens 


Human secreted protein sequence encoded by 
gene 93 SEQ ID NO:362.


ICQ 

76o 


100


1023 


AE000660 


Homo sapiens 


nAJjV36!Sl 


J /J 


100


1024 


AF132965


Homo sapiens


CGI-31 protein


i j ju 


100


1025 


W92380 


Homo sapiens 


Human TR-interacting protein SI 03a. 


14O0 


G7 


1026 


R66278 


Homo sapiens 


Therapeutic polypeptide from glioblastoma cell 


line. 


830 


100 


1027 


X65614 


Homo sapiens 


S100P calcium-binding protein


476 


100 


1028 


Y41741 


Homo sapiens 


Human PRO704 protein sequence. 


13/3 


100


1029 


AJ001014 


Homo sapiens 


RAMP1 


806 


100 


1030


W63682 


Homo sapiens 


Human secreted protein 2. 


1354 



99 


1031 


AK023007 



Homo sapiens 


unnamed protein product 


766 


100


1032 


W97900 


Homo sapiens 


Human SR-BI class B scavenger. 


26/2 


no 
9V 


1033 


Y82453 


Homo sapiens 


Human TGC-440 secretory protein SEQ ID
NO:1.


63y 


yy 


1034 


Y73473 


Homo sapiens 


Human secreted protein clone yd 17 81 protein 
sequence oc.^ lu inu.Ioo. 


752 


93 


1035 


Y86468 


Homo sapiens 


Human gene 48-encoded protein fragment, SEQ

LU JN\J.35J, 


yo 


on 
vu 


1036


U09813


Homo sapiens


mitochondrial ATP synthase subunit 9 precursor






1037


AJ242832


Homo sapiens


calpain




00 


1038 


X66403 


Homo sapiens 


acetylcholine receptor epsilon subunit CHRNE 


2574 


100 


1039 


AJ242730 


Homo sapiens 


polyhomeotic 2 


1310 


100 


1040 


AF169968


Mus 

musculus 


DNA binding protein UhaKI


14<7 


CA 


1041 


X52563 


Bos taurus 


permeability increasing protein


383 


29 


1042 


G00368 


Homo sapiens 


Human secreted protein, SEQ ID NO: 4449.


/ J 


5U 


1043 


G02532 


Homo sapiens 


Human secreted protein, SEQ ID NO: 6613.


6U 


33 


1044 


M94582 


Homo sapiens 


interleukin 8 receptor B 


1850 


100 


1045 


AL080239 


Homo sapiens 


bG256022.1 (similar to IGFALS (insulin-like 
growth factor binding protein, acid labile
subunit)) 


1704 



50 


1046 


AF125101 


Homo sapiens 


HSPC040 protein 


580 


100 


1047 


W74809 


Homo sapiens 


Human secreted protein encoded by gene 81 
clone HMWDN32. 


176 


100 


1048 


AL022238 


Homo sapiens 


dJ1042K10.4 (novel protein)


2201 


100 


1049 


W88667 


Homo sapiens 


Secreted protein encoded by gene 134 clone 
HAIBP89. 


1559 


99 


1050 


AF097518 


Homo sapiens 


liver-specific transporter 


2820 


100 






SEQ
ID
NO:


Accession
No.


Species


Description


Smith-
Waterman
Score


%
Identity




1051 


W78324 


Homo sapiens 


Fragment of human secreted protein encoded by
gene 81.


1318


98






1052 


Y21851 


Homo sapiens 


Human signal peptide-containing protein (SIGP)
(clone ID 2328134).


1643


95






1053 


AL163815 


Arabidopsis
thaliana


putative protein


661


62








1054 


Y76200 


Homo sapiens 


Human secreted protein encoded by gene 77. 


262 


100 


1055 


AJ276567 


Homo sapiens 


TC10-like Rho GTPase


1160 


100 


1056 


Y27620 


Homo sapiens 


Human secreted protein encoded by gene No. 54. 


154 


96 


1057 


D14530 


Homo sapiens 


ribosomal protein 


745 


100 


1058 


AF132000


Homo sapiens 


TADA1 protein 


1132 


100 


1059 


AL031778 


Homo sapiens 


dJ34B21.1 (novel BZRP (benzodiazepine
receptor (peripheral) (MBR, PBR, PKBS, IBP,
Isoquinoline-binding protein)) LIKE protein)


920


100






1060 


AF227135


Homo sapiens 


candidate taste receptor T2R9 


134 


33 


1061


Y27575 


Homo sapiens 


Human secreted protein encoded by gene No. 9. 


1392 


100 


1062 


Z11697


Homo sapiens 


HB15 


1088 


100 


1063 


AF123757 


Homo sapiens 


putative transmembrane protein 


819 


100 


1064 


AF155135 


Homo sapiens 


novel retinal pigment epithelial cell protein 


2932 


99 


1065 


Y41674 


Homo sapiens 


Human channel-related molecule HCRM-2. 


936 


99 


1066 


AJ250042 


Homo sapiens 


Rab5 GDP/GTP exchange factor homologue 


2575 


100 


1067 


Y36087 


Homo sapiens 


Extended human secreted protein sequence, SEQ
ID NO. 472.


770


85






1068 


Y94959 


Homo sapiens 


Human secreted protein clone mc300_1 protein
sequence SEQ ID NO: 124.


301


100






1069 


Y94959 


Homo sapiens 


Human secreted protein clone mc300_1 protein
sequence SEQ ID NO: 124.


301


100






1070 


W64535 


Homo sapiens 


Human leukocyte cell clone HP00804 protein. 


2014 


99 


1071 


X03145 


Homo sapiens 


pot. ORF 111 


148 


50 


1072 


AL031177 


Homo sapiens 


dJ889M15.3 (novel protein)


821 


91 


1073 


X82200 


Homo sapiens 


gpStaf50 


249 


62 


1074 


G03213 


Homo sapiens 


Human secreted protein, SEQ ID NO: 7294. 


99 


47 


1075 


Y36233 


Homo sapiens 


Human secreted protein encoded by gene 10. 


506 


55 


1076 


G03187 


Homo sapiens 


Human secreted protein, SEQ ID NO: 7268. 


424 


98 


1077 


L25899 


Homo sapiens 


ribosomal protein L10 


332 


76 


1078 


Y91447 


Homo sapiens 


Human secreted protein sequence encoded by
gene 48 SEQ ID NO:168.


898


97






1079 


G01862 


Homo sapiens 


Human secreted protein, SEQ ID NO: 5943. 


290 


89 


1080 


AB039723 


Homo sapiens 


WNT receptor frizzled-3 


1376 


92 


1081 


AB020527 


Homo sapiens 


Na/PO4 cotransporter homolog


269 


100 


1082 


L13802


Homo sapiens 


ribosomal protein small subunit


499 


80 


1083 


W75098 


Homo sapiens 


Human secreted protein encoded by gene 42
clone HSXBI25.


143


81






1084 


G03564 


Homo sapiens 


Human secreted protein, SEQ ID NO: 7645. 


83 


51 


1085 


G04063


Homo sapiens 


Human secreted protein, SEQ ID NO: 8144. 


88 


43 


1086 


AF090942 


Homo sapiens 


PRO0657 


124 


64 


1087 


G00517 


Homo sapiens 


Human secreted protein, SEQ ID NO: 4598. 


129 


41 


1088 


G04091 


Homo sapiens 


Human secreted protein, SEQ ID NO: 8172. 


126 


36 


1089 


AF140631 


Homo sapiens 


G-protein coupled receptor 14 


364 


82 


1090 


G04063 


Homo sapiens 


Human secreted protein, SEQ ID NO: 8144. 


114 


32 


1091 


S72304 


Mus sp.


LMW G-protein 


146


83 


1092 


W88708 


Homo sapiens 


Secreted protein encoded by gene 175 clone
HEMAM41.


405


100






1093 


W85612 


Homo sapiens 


Secreted protein clone fhl23_5. 


4358 


97 


1094 


Y53012 


Homo sapiens 


Human secreted protein clone pm514_4 protein
sequence SEQ ID NO:30.


1013


99






1095 


Y92345 


Homo sapiens 


Human cancer associated antigen precursor from
clone NY-REN-62.


409


100






1096 


AF090942 


Homo sapiens 


PRO0657 


147 


60 


1097 


L24521 


Homo sapiens 


transformation-related protein 


166 


58 


1098 


X56932 


Homo sapiens 


23 kD highly basic protein 


490 


70 


1099 


G04063 


Homo sapiens 


Human secreted protein, SEQ ID NO: 8144. 


83 


35 


1100 


Y02693


Homo sapiens 


Human secreted protein encoded by gene 44
clone HTDAD22.


149


59














SEQ
ID
NO:


Accession
No.


Species


Description


Smith-
Waterman
Score


%
Identity




1101 


AF119851


Homo sapiens


PRO1722


183 


72 


1102


G04086


Homo sapiens


Human secreted protein, SEQ ID NO: 8167. 


207 


62 


1103 


G04063 


Homo sapiens


Human secreted protein, SEQ ID NO: 8144. 


91 


52 


1104


X74856


Mus
musculus


ribosomal protein L28


128


69








1105 


G03789 


Homo sapiens 


Human secreted protein, SEQ ID NO: 7870. 


130 


62 


1106 


G03133 


Homo sapiens


Human secreted protein, SEQ ID NO: 7214.


122 


48 


1107 


G03040 


Homo sapiens 


Human secreted protein, SEQ ID NO: 7121. 


69 


43 


1108 


AF039942 


Homo sapiens


HCF-binding transcription factor Zhangfei


744 


99 




1109


AF201951


Homo sapiens


high affinity immunoglobulin epsilon receptor
beta subunit


738


94






1110


AF111108


Mus
musculus


transient receptor potential 2


223


79








1111


AF119900


Homo sapiens


PRO2822


144 


59 


1112 


Y16589


Homo sapiens


A protein that interacts with presenilins


265 


39 


1113


G02872


Homo sapiens


Human secreted protein, SEQ ID NO: 6953.


178 


67 


1114


Y09QQ9


Homo sapiens


Fragment of human secreted protein encoded by
gene 121.


164


63






1115


Y30811




Human secreted protein encoded from gene 1.


1217 


99 


1 1 16 


X51394 


Xpnonns 


APRCt nreciirvir nrotein 


130 


40 






Ifievi^ 








1117


M27826


Homo sapiens


neutral protease large subunit


442 


65 


1118


G03371


Homo sapiens


Human secreted protein, SEQ ID NO: 7452.


72 


60 


1119


G03602


Homo sapiens


Human secreted protein, SEQ ID NO: 7683.


491 


97 


1120


Y35906


Homo sapiens


Extended human secreted protein sequence, SEQ
ID NO. 155.


244


97






1121


G03714


Homo sapiens


Human secreted protein, SEQ ID NO: 7795.


122 


65 


1122


Y00337




Human secreted protein encoded by gene 81


110


90


1 izj 




flUUlO bapjCIlo 


iwu pure uujiJtaiii rv' uiaiiuci, i rvorv-z 






1 194 

J J Z*t 


AF9 19869 




mpmhmTip t ntprn r*H n o *nrotpin r»fl?f¥^l6 

111C11 1 Ul Uli^ HI LCI aLU jJlv'^t'ill Ul IWJuiU 


449 


88 


1 19S 


W lrrtU7 


nuiliu >apiCllo 


Human Qf/~" rf*t(*{\ nmtwn frn m /*)rtn^ ^W79S 9 


191 


53 


i 1 ZD 


VJU 1 jOI 


numo sapiens 


Hum^n cp^rpfpH nrrktpin *sFO ID "NJO- S449 

o uiuaJi secret cu pruiciii, JHi\j iu jhm-z. 


1S4 


100 

IVv 


1 197 


c,c\ i i/i i 


riomo Sapiens 


nuniai] secret ca pr t/tciii, jl^ ix^, >> t + t fz . 


IAS 


inn 


1198 

J Izo 


i o't-jzu 


riomo sapiens 


riuman caruiovascuiar sysxeni dssuciaica pruicin 


81 S 

□ 13 










VltlOOP-l 

AiliaSC'J . 






1 19Q 


nn9 ins 


numo sup i ens 


Human vf*r*rp\f*A nrntpin SPH TF) 61fi6 

nuiiidji sccrcicu pruicin, ud*^ iij inw. dido. 


88 
oo 


73 






nurno sapiens 


Trartcm pinKranp Mrtmntn rftntflinino nrntpin plonp 
1 1 Oi (MHC1J1UI allC* UUlJldlll LUllLaJJlIIl^ piVJLt-lll UiVJllC 


700 


100 

Ivv 








HPA1S19 
nruij 1 z. 






i 1 31 


V9QR17 


nuinu ><ipiciii> 


Human cvnancp rplotpH olv^*Ar*w\tPin 9 
FLuiiiaii by I lapse EClaLCU giyt-upiuicjii £.. 


260 


01 
y i 


1 1 39 

1 1 JZ 


VOIlvld 

I 3 lO-fr 


rlUlUU aaUlCllo 


Unman cp^rptpn niY^tpin cpnupnpp Ar^ pr\H Prl hv 

nuiiidii i>cvi clc-u pruiciii AdjuciiLrw chlajuc^u uy 


57 S 


96 








pene 43 SFO ID NO*3 1 7 






1 1 VI 


VQ1 440 


t-Trtrvtn enni^nc 


Hum tin QpfrptpH nrotpiti ^pnilpnpp pnpnHpH 
rTUiiiail aCvICLCU piui&iii ^cijucjiuv viivuu^u uy 


542 


100 








eene 49 SFO IDNO170 






1 134 


AT*0 17908 

ADv J r J7\JO 


Unmn ennipn^ 


4F9 lioht rhain 


2399 


93 


1 1 3 s 




nuniu IxipiCIIo 


ziiiV' linger piuiciii ^ joo t\t\.j 


319 


ss 


1 136 

1 1 JO 


Y99496 


Hrtmrv Qftnipnc 

I IVJl l L\J DO u Id 13 


Human PRO!6fl4 fHN078S 1 l amino arid 


917 


72 








seouence SEO IDNO*308 






1 137 


H03790 




Human ^errrferl nmtcin SFO TD NO- 7871 


102 


50 


1 138 

1 J.JO 


AF1SS106 




NY-RFN-36 sntioen 


768 


91 






LJrt rry f\ pom /*fi P 
nuiTIU oapiClli 


/4I95!H9ft 1 /nrtvpl nrr»f^*i n cimilar fA m^irvhirinP 

U^LOnXv, i yllv/VCI LflvlvllI ilttlfirtl IV filC-lilUlllil v 


i J7 

4 1/ 


SO 








trj>n<i'nnrt nrot^inc^ 






1 140 


AF01 nso 


D \Jj tatUUj 


rponlator Af fr-tirrttpin Qionntino 7 
JCrguioivi vl v/ pj t/tViJI ■J'^'ltlllilg / 


138 


96 


1 141 


Y7001R 


nuinu oapicii> 


Human Prrttpficp ann nccrvrintpn nrfttpin-19 


623 


100 








i'PPRG-12 1 i 






1142


G04091




Human secreted protein, SEQ ID NO: 8177.


113 


38 


1143


AB030235


Canis
familiaris


D4 dopamine receptor


89


48








1144 


Y94922 


Homo sapiens 


Human secreted protein clone pv6J protein 


539 


88 








sequence SEQ ID NO:50. 






1145 


X99962 


Homo sapiens 


rab-related GTP-binding protein 


398 


96 


1146 


G03807 


Homo sapiens 


Human secreted protein, SEQ ID NO: 7888.


168 


79 


1147 


G03712 


Homo sapiens 


Human secreted protein, SEQ ID NO: 7793. 


512 


85 


1148 


Y28279 


Homo sapiens 


Human G-protein coupled receptor GKJR-1. 


705 


76 


1149 


U13642


Caenorhabditis
elegans


exon 5 similar to transmembrane domain of S.
cerevisiae zinc resistance protein


247


36






1150 


G03438 


Homo sapiens 


Human secreted protein, SEQ ID NO: 7519. 


117 


62 


1151 


G01003 


Homo sapiens 


Human secreted protein, SEQ ID NO: 5084. 


181


80 


1152 


G03798 


Homo sapiens 


Human secreted protein, SEQ ID NO: 7879. 


198 


63 


1153 


X88799 


Oryza sativa 


DNA binding protein 


95 


41 


1154 


D85245 


Homo sapiens 


TR3beta 


155 


96 


1155 


R74272 


Homo sapiens 


Tumour suppressor protein, p53. 


341 


87 


1156 


Y86265 


Homo sapiens 


Human secreted protein HUSXE77, SEQ ID 
NO: 180. 


99 


41 


1157 


G02577 


Homo sapiens 


Human secreted protein, SEQ ID NO: 6658. 


263 


98 


1158 


AF104334 


Homo sapiens 


putative organic anion transporter 


185 


42 


1159 


G01393 


Homo sapiens 


Human secreted protein, SEQ ID NO: 5474. 


173 


57 


1160 


W75771 


Homo sapiens 


Human GTP binding protein APD08. 


224 


81 


1161 


AF216833


Homo sapiens 


M-ABC2 protein 


410 


83 


1162 


W67816 


Homo sapiens 


Human secreted protein encoded by gene 10
clone HCEMU42. 


1156 


100 


1163 


AF119851


Homo sapiens


PRO1722


230 


70 


1164 


Y87252 


Homo sapiens 


Human signal peptide containing protein HSPP-29
SEQ ID NO:29.


113 


31 


1165 


W64537 


Homo sapiens 


Human liver cell clone HP01148 protein.


338 


82 


1166 


AF269286 


Homo sapiens 


HC6 


134 


64 


1167 


Y14482 


Homo sapiens 


Fragment of human secreted protein encoded by 
gene 17. 


149 


51 


1168 


D90789 


Escherichia 
coli 


Dipeptide transport system permease protein 
DppC. 


411 


90 


1169 


R63783 


Homo sapiens 


TG0847 protein. 


344 


90 


1170 


Y45274 


Homo sapiens 


Human secreted protein encoded from gene 18.


478 


98 


1171 


D64154 


Homo sapiens 


Mr 110,000 antigen 


347 


96 


1172 


AB026256 


Homo sapiens 


organic anion transporter OATP-B 


311 


67 


1173 


G00357 


Homo sapiens 


Human secreted protein, SEQ ID NO: 4438. 


60 


52 


1174 


D87717 


Homo sapiens 


similar to human GTPase-activating 
protein(A49869) 


178 


59 


1175 


M64716 


Homo sapiens 


ribosomal protein 


391 


78 


1176 


R08330


Homo sapiens 


Human IL-7 receptor clone H6. 


285 


67 


1177 


L06505 


Homo sapiens 


ribosomal protein L12 


242 


72 


1178 


AJ251885 


Homo sapiens 


organic cation transporter (OCT2) 


276 


88 


1179 


G03258 


Homo sapiens 


Human secreted protein, SEQ ID NO: 7339. 


155 


71 


1180 


G01207 


Homo sapiens 


Human secreted protein, SEQ ID NO: 5288. 


282 


90 


1181 


AF181856 


Rattus 
norvegicus 


tRNA selenocysteine associated protein 


249 


62 


1182 


AF161524 


Homo sapiens 


HSPC176 


138 


90 


1183 


G03789 


Homo sapiens 


Human secreted protein, SEQ ID NO: 7870. 


282 


66 


1184 


Y02671 


Homo sapiens 


Human secreted protein encoded by gene 22 
clone HMSJW18. 


107 


71 


1185 


G03797 


Homo sapiens 


Human secreted protein, SEQ ID NO: 7878. 


88 


69 


1186 


G03564 


Homo sapiens 


Human secreted protein, SEQ ID NO: 7645. 


118 


46 


1187 


AB032905 


Hylobates 
concolor 


dopamine receptor D4 


96 


37 


1188 


G00956 


Homo sapiens 


Human secreted protein, SEQ ID NO: 5037.


292 


78 


1189 


G03258 


Homo sapiens 


Human secreted protein, SEQ ID NO: 7339. 


178 


79 


1190 


G03361 


Homo sapiens 


Human secreted protein, SEQ ID NO: 7442. 


324 


76 


1191 


AF117755


Homo sapiens 


thyroid hormone receptor-associated protein 
complex component TRAP230 


187 


70 


1192 


Y70455 


Homo sapiens 


Human membrane channel protein-5
(MECHP-5).


202 


67 


1193 


G03052 


Homo sapiens 


Human secreted protein, SEQ ID NO: 7133. 


99 


42 


1194 


G02607 


Homo sapiens 


Human secreted protein, SEQ ID NO: 6688. 


192 


76 


1195 


W29661 


Homo sapiens 


Homo sapiens CI542 2 clone secreted protein. 


2001 


98 


1196 


Y14104


Homo sapiens 


Human GABAB receptor 1 d protein sequence. 


239 


69 


1197 


X61972 


Homo sapiens 


macropain subunit iota 


149 


90 


1198 


G00534 


Homo sapiens 


Human secreted protein, SEQ ID NO: 4615. 


145 


51 


1199 


Y86260 


Homo sapiens 


Human secreted protein HELHN47, SEQ ID 
NO: 175. 


1089 


89 


1200 


G02607 


Homo sapiens 


Human secreted protein, SEQ ID NO: 6688. 


154 


57


1201




Homo sapiens


Human secreted protein, SEQ ID NO: 4919.


404 


50 




1M27R26 


Homo cuniPTiQ 




202 


49 


1203


Y73424


Homo sapiens


Human secreted protein clone vi4 1 protein
sequence SEQ ID NO:70.


265 


61 


1204


AF264014


Homo sapiens


scavenger receptor cysteine-rich type 1 protein
M160 precursor


625 


98 


1205




Homo sapiens


Human secreted protein #75


219 


59 


1906 


T f7C t 1 1 
U (Ol 1 1 


(Cathie crnlliiQ 

\J uJl Uj gallUd 


AO 


205 


SI 


1207


AF0954451


Homo sapiens


putative G protein-coupled receptor


416 


76 


190J? 


API I671S 

AVI 1 IU / 1 J 


Hntnn ^anfpn^ 




127 


75 


1209 


AF099137 


Homo sapiens 


MaxiK channel beta 2 subunit 


475 


95 


1 Z 11/ 


.rVTZl/J / 10 


Wnmrt enni^nc 
numu iKiptCIio 




423 


79 


1911 


■I £i /OUO 


Wornr* piVnQ 


Human tiftcrptpd nrotpin pneodpd hv cenfiNo 
107 


224 


70 


1212 


G00719 


Homo sapiens


Human secreted protein, SEQ ID NO: 4800.


117 


44 


1213 


G01009 


Homo sapiens 


Human secreted protein, SEQ ID NO: 5090. 


351 


73 


1914 
1Z 1*t 


AFfiOOQ47 


JTlUIiJO rSilJJJClJi 


I I\vUUJ / 


124 


70 


1215 


Y14427


Homo sapiens 


Human secreted protein encoded by gene 17 

rlone H<1TFA 14 


99 


11 


1216 


G03905 


Homo sapiens 


Human secreted protein, SEQ ID NO: 7986. 


173 


57 


ton 
izl / 


V^90Q9 


nomo sapiens 


riuman uansmemDrane protein ni iyutin-zi. 


1 1 1X 


100 


Izlo 


TOO 1GA 


riomo saptens 


nia-ur antigen aipna cnain 


*tJ*t 


78 


loin 


Yjyvuy 


Homo sapiens 


becretea protein /o-zo-j-a i 4-r l* i . 


470 


Q9 

yz 


1 OOO 

IzAJ 


WoIj /O 


Homo sapiens 


r,ts v-inauceo o-proiein coupieo receptor ^jDtsi- 
t. ) poiypepiiue. 


//J 


i on 


1221 


W96745 


Homo sapiens 


High affinity immunoglobulin E receptor-like 
protein ^iojdisjjJ. 


650 


98 


Iz/Z 




xiumo s<ipien;> 


LjXicnuca iiumaii bccicicu pruicin bCLjuciHjC, oov^ 
ID NO. 160. 




J 1 


1993 
I ZzJ 


I \A/Z /o 


iiomo sapiens 


riuman secrcicu protein eiicuueu uy gene zi . 


9^0 

ZOv 




lzz4 


ATM £1 A99 

At loi^zz 


nomo sapiens 


rlorL.j'J'f 




QO 


1 99S 
1ZZJ 


T 11 4070 


U Arvrn O 4*%* Art O 

riomo sapiens 


miKa c*r*\ m ol nro4Aln ^ 

riDOSoniai proiem 


909 




1226 


GO 1733 


Homo sapiens 


Human secreted protein, SEQ ID NO: 5814. 


610 


100 


lZZ/ 




svius 

muscutus 


scniaienz 




SA 

JO 


lzzo 


on mo 


Homo sapiens 


riuman secreiea protein, ojca^ il/ inu. jzyy. 


i 


Rl 
o 1 


1229 


AF217188 


Mus 

rnuscuius 


YIP1B 


801 


63 


i9io 
IzjU 


Ar l /Ool j 


riomo sapiens 


soiuDie auenyiyj cyclase 


97^ 

Z / J 


100 


191 1 
IZj 1 




riomo sapiens 


organic canon u ansponer 


1704 
1 iKM 


too 


1919 
I ZjZ 


W / '-f 7 J_ 


riomo Sapiens 


nunian MsCrcicu proiem eiiiAJucu uy gciic / / 
clone HOEAS24. 


919 

1Z 


s^ 


1 921 




riomo sapiens 


riuman sccreieu protein cioue yioz_ i proiem 


^96 

JZD 


1 00 
1 uu 


1234 


U76618 


Mus 

ill U6L>UlUo 


N-RAP 


482 


82 


iz jj 








JOv 


97 






Hrtmri citnipnd 
OUII1U MdtJjlGJiO 


Human Qprrrfpd nrotein SFO ID NO* 5S40 


417 


100 


mi 


AF000018 


Homo sapiens


adapter protein 


164 


84 


IZjo 


W 60OJJ 


nuiiio bdpiciio 


V £hr*rcktc*ri v\rc%tt»in pnr^An^n rWJ rr i^n P 1 1 1111 fMnnP 

occrcicu pruLciii chluucu uy gciic iuu ^.luiic 


250 


90 


1239 


W29660 


Homo sapiens 


Homo sapiens CH27_1 clone secreted protein. 


697 


98 


1 9^IA 
Jz'rU 




vjryciojagus 

/*im i/^ii ft io 
CUJ11CU1US 


per uxi^orniij i^a^tjcpejiucJii bujuic cajx ici 


ISA 


59 


1241 


Y99710 


HomO QPrtiPtlQ 


Hiinrifin rn^rnhrflnp-flQQfiPifttftrl nrotpin ^^1(^2.4 


709 


97 


1242 


Y95002 


Homo sapiens 


Human secreted protein vc34_l, SEQ ID NO:44. 


908 


88 


1243 


Y44905 


Homo sapiens 


Human potassium channel molecule ERG-LP2 
partial protein. 


325 


100 


1244 


AF284422


Homo sapiens 


cation-chloride cotransporter-interacting protein 


511 


97 


1245 


Y53629 


Homo sapiens 


A bone marrow secreted protein designated 
BMS115. 


1888 


93 


1246 


AB039371 


Homo sapiens 


mitochondrial ABC transporter 3


389 


97 


1247 


Y35911 


Homo sapiens 


Extended human secreted protein sequence, SEQ
ID NO. 160.


168


39






1248 


AF072509 


Rattus 
norvegicus 


glutamate receptor interacting protein 2


559 


90 


1249 


AF247042 


Homo sapiens 


tandem pore domain potassium channel TRAAK 


661 


98 


1250 


B08974 


Homo sapiens 


Human secreted protein sequence encoded by 
gene 27 SEQ ID NO:131.


1087 


97 


1251 


L15313 


Caenorhabditis
elegans


putative 


858 


59 


1252 


Y29338 


Homo sapiens 


Human secreted protein clone it217_2 alternate 
reading frame protein. 


278 


75 


1253 


W01730 


Homo sapiens 


Human G-protein receptor HPRAJ70.


211 


92 


1254 


G03074 


Homo sapiens 


Human secreted protein, SEQ ID NO: 7155. 


294 


83 


1255 


G01818 


Homo sapiens 


Human secreted protein, SEQ ID NO: 5899. 


253 


91 


1256 


AF286368 


Homo sapiens 


eppin-1 


222 


54 


1257 


AF220264 


Homo sapiens 


MOST-1 


87 


93 


1258 


G02227 


Homo sapiens 


Human secreted protein, SEQ ID NO: 6308.


281 


78 


1259 


Y07970 


Homo sapiens 


Human secreted protein fragment #2 encoded 
from gene 26. 


81 


94 


1260 


R95332 


Homo sapiens 


Tumor necrosis factor receptor I death domain
ligand (clone 3TW). 


986 


100 


1261 


AF140674 


Homo sapiens 


zinc metalloprotease ADAMTS6


172 


36 


1262 


U28369 


Homo sapiens 


semaphorin V


237


67 


1263 


Y07049 


Homo sapiens 


Renal cancer associated antigen precursor 
sequence


288 


71 


1264 


Y36153 


Homo sapiens 


Human secreted protein #25. 


187 


80 


1265 


Y78114 

A r VP A A T 


Homo ^anipn^ 


Human rvtnlrinp <?ipnfil reoiilfltnr PJf SFO 
1DNO-2 






1266 


Y13397


Homo sapiens


Amino arid ^pnnpnrf* nf nrntpin PH0^^4 




ion 


1267 


AF030558 


Rattus 
norvegicus 


phosphatidylinositol 5-phosphate 4-kinase 
gamma 


859 


95 


1268 


U73167 


Hrtmn QflnipnQ 


rflnHirfstp tiimrvr cnrviM"f*Qcor trpn** T ITfA— 1 




Oft 


1269 


AF190664 


Mus 

rniiccnliie 


LMBR2 


552 


76 


1270 


AL050332 


Homo sapiens


dJ570F3.1 (homolog of the rat synaptic
GTPase-activating protein p135 SynGAP)




OR 


1271 


G02126 


Homo sapiens


Human secreted protein, SEQ ID NO: 6207.


13 1 




1272 


AF125533 


Homo sapiens


^AOH-cvtAchrnrne rcdncta<^ t«;nfnrm 


J J 




1273 


AL035661 


Homo sapiens


dJ568C11.3 (novel AMP-binding enzyme
similar to acetyl-coenzyme A synthethase 
(acetate-coA ligase)) 






1274 


AF064748 


Mus 

musculus 


S3-12 


3523 


61 


1275 


D17554 


Homo sapiens 


TAXREB107 


377 


78 


1276 


Y30715 


Homo sapiens 


Amino acid sequence of a human secreted 
protein. 


643 


90 


1277 


AF146760


Homo sapiens 


septin 2-like cell division control protein 


707 


100 


1278 


Y05069 


Homo sapiens 


Human PIGR-2 protein sequence. 


281 


46 


1279 


X59668 


Oryctolagus 
cuniculus


aorta CNG channel (rACNG) 


267 


85 


1280 


G01051 


Homo sapiens 


Human secreted protein, SEQ ID NO: 5132. 


489 


98 


1281 


G03411 


Homo sapiens 


Human secreted protein, SEQ ID NO: 7492.


120 


43 


1282 


AF055084 


Homo sapiens 


very large G-protein coupled receptor-1


1635 


100 


1283 


AF117814


Mus

musculus


odd-skipped related 1 protein




70 


1284 


U87318 


Xenopus 
laevis 


NaDC-2 


535 


60 


1285 


AF061346 


Mus 

musculus 


Edpl protein 


452 


68 


1286 


AB030182 


Mus 

musculus 


contains transmembrane (TM) region 


582 


68 


1287 


A13595 


synthetic 
construct 


immunosuppressive protein PP15


185 


97 


1288 


AF254411 


Homo sapiens 


ser/arg-rich pre-mRNA splicing factor SR-A1 


837 


100 


1289 


AF084205 


Rattus 
norvegicus 


serine/threonine protein kinase TAO1


319 


98


1290


AF038563


Homo sapiens


membrane associated guanylate kinase 2


523 


100 


1291 


AF034837 


Homo sapiens 


double-stranded RNA specific adenosine
deaminase


468 


100 


1292 


M15888


Bos taurus 


endozepine-related protein precursor 


937 


87 




1293


AR01fViQ9


Arabidopsis
thaliana


ATP-dependent RNA helicase-like protein


636 


45 


1294 


AF209923 


Homo sapiens


orphan G-protein coupled receptor


1570


100


1295 


W67828 


Homo sapiens 


Human secreted protein encoded by gene 22 
clone HFEAF41 


504 


98 


1296 


AC004832 


Homo sapiens 


similar to 45 kDa secretory protein ; similar to 
CAA10644.1 (PID:g4164418)


648 


65 


1297


XR0035


Oryctolagus
cuniculus




575 


70 


1298 


G02645 


Homo sapiens


Human secreted protein, SEQ ID NO: 6726.


223 


97 


1299 


Y59440 


Homo sapiens


Human delta3 fragment ^4 


122 


32 




W705O4 


Homo ^ani(Mi^ 


I pnVnrvtp cpv^n timp<; inpnihrafiP-Dpnptr/itfnQ 

type receptor protein JEGI 8. 


459 


81 


1301 

< Jvl 


Y67315 


Homo ^aniens 


Human ^pcreted nrotfin RI.R9 13 amino acid 
sen u en ce 

kJ WVJ \A vl 1 V V i 


3916 


99 


1302 


M77693 


Homo sapiens


spermidine/spermine N1-acetyltransferase


174 


96 


1303 


G01331 


Homo sapiens


Human secreted protein, SEQ ID NO: 5412.


254 


69 


1304 


G01491 


Homo sapiens 


Human secreted protein, SEQ ID NO: 5572.


747 


99 


1305 


AF148509




alnhn 1 7 -mnnnns i 

en yji ici i^^'iiiajutuoiutuv 


602 


98 


IJw 


nnifi58 

uu i ujo 


rcumu sup i ciio 




333 
j j j 




1307 


Y90899 


Homo sapiens 


Dl-Iike dopamine receptor activity modifying 

nrotein *?FO ID NO I 


332 


98 


ijuo 




nviliu aapidlo 


n'S'? rpmi taf^W PA7A-T7 nui'lptir T%rrvt**in 






1309 


Y73388 


Homo sapiens 


HTRM clone 3376404 protein sequence. 


147 


66 


IJIU 


A Tift A 3 0/l "3 


dos laurus 


noosortiaj proietn lju 


9 OA 


on 


1111 




iTiiicr > n!i)c 

HllJjLrlilltO 


aTbcniic inuuciuic tvrN/\ assuciaicu pruLcui 


Ooo 


/U 


1312 


Y73342 


Homo sapiens 


HTRM clone 2709055 protein sequence. 


1154 


100 


1113 




jiumo Ddpiejis 


jjuman n\ui /ou ^uinv^o**x^ ammo at-iu 
sequence SEQ ID NO:282. 


114^ 

1 1 


7R 


1314 


Api 


riutno odpiciio 


PRO1777


433 


Q7 


1315 


W75100 


Homo sapiens 


Human secreted protein encoded by gene 44 
clone HE8CJ26. 


807 


97 


1316 


AJ272078 


Homo sapiens 


APOBEC-1 stimulating protein 


789 


100 


1317 


AB041533 


Homo sapiens 


sperm antigen 


2607 


98 


1318 


U19617 


Mus 

musculus 


bll-1 


OA/ 

806 




1319 


U82598 


Escherichia 
coli 


ferric enterobactin transport protein 


768 


100 






Escherichia
coli


(GLUCITOL-6-PHOSPHATE
DEHYDROGENASE) (KETOSEPHOSPHATE
REDUCTASE)


/UV 


inn 


1321 


W67847 


Homo sapiens 


Human secreted protein encoded by gene 41
clone HPBCJ74


601 


92 


1322 


AJ276101 


Homo sapiens


GPRC5B protein


466 


93 


1323 


AJ276101 


Homo sapiens


GPRC5B protein


504 


97 


1324 


Y58628 


Homo sapiens


Protein regulating gene expression


1584 


100 


1325 


U91561 


Rattus 


pyridoxine 5'-phosphate oxidase 


1277 


89 


1326 


AF175S33 






1606 


100 

I \J\J 


1327 


Y32206 


Homo sapiens


Human receptor molecule (REC) encoded by
Incyte clone 2825826. 


1531 


90 


1328 


AF151048 


Homo sapiens 


HSPC214


657 


85 


1329 


Y10530 


Homo sapiens 


olfactory receptor 


1645 


100 


1330 


AF180681


Homo sapiens 


guanine nucleotide exchange factor 


4314 


99 


1331 


AF111856 


Homo sapiens 


sodium dependent phosphate transporter isoform 
NaK-3b 


3591 


99 


1332 


Y13583 


Homo sapiens 


G-protein coupled receptor 


2171 


100 


1333 


AF078866 


Homo sapiens 


SURF-4 


1395 


100


1334 


Y25755 


Homo sapiens 


Human secreted protein encoded from gene 45. 


1380 


96 


1335 


AF152325


Homo sapiens 


protocadherin gamma A5 


4742 


99 


1336 


X74070 


Homo sapiens 


transcription factor BTF3 


639 


81 


1337 


AF095927 


Rattus 
norvegicus 


protein phosphatase 2C 


1931 


95 


1338 


G03877 


Homo sapiens 


Human secreted protein, SEQ ID NO: 7958. 


621 


100 


1339 


AL008582 


Homo sapiens 


OK223H9.2 (ortholog of A. thaliana F23F1.8)


626 


100 


1340 


X61615 


Homo sapiens 


leukemia inhibitory factor receptor 


5820 


99 


1341 


Y01519 


Homo sapiens 


A carcinogenesis-inhibiting protein. 


7528 


97 


1342 


AF207600 


Homo sapiens 


ethanolamine kinase


2372 


100 


1343 


U54807 


Rattus 
norvegicus 


GTP-binding protein 


1167 


97 


1344 


AC020579 


Arabidopsis 
thaliana 


putative phosphoribosylformylglycinamidine
synthase; 25509-29950 


3283 


51 


1345 


Y28576 


Homo sapiens 


Secreted peptide clone pe503_K 


944 


100 


1346 


W74787 


Homo sapiens 


Human secreted protein encoded by gene 58 
clone HHFHN61. 


1171 


100 


1347 


M55542 


Homo sapiens 


guanylate binding protein isoform I 


2636 


87 


1348 


AF183428


Homo sapiens 


28.4 kDa protein 


1329 


100 


1349 


U70669 


Homo sapiens 


Fas-ligand associated factor 3 


167 


24 


1350 


AF295530 


Homo sapiens 


cardiac voltage gated potassium channel 
modulatory subunit 


562 


99 
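
Table 2 reports, for each sequence, a Smith-Waterman score and a percent identity against its nearest database neighbor. As a rough illustration only of how such values can arise, the following Python sketch computes a local alignment score and the percent identity over the aligned region. The function name smith_waterman and the scoring values (match +2, mismatch -1, linear gap -2) are assumptions chosen for illustration; the scoring matrix and gap penalties actually used to produce the table are not stated in this section, so the numbers below are not comparable to the tabulated scores.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
    """Return (best local alignment score, percent identity over the aligned region).

    Illustrative scoring only: a flat match/mismatch score and a linear gap penalty.
    """
    rows, cols = len(a) + 1, len(b) + 1
    # H[i][j] holds the best score of a local alignment ending at a[i-1], b[j-1].
    H = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)
    for i in range(1, rows):
        for j in range(1, cols):
            diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = H[i - 1][j] + gap      # gap in b
            left = H[i][j - 1] + gap    # gap in a
            H[i][j] = max(0, diag, up, left)
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)

    # Trace back from the best cell until the score drops to zero,
    # counting identical positions and total alignment columns.
    i, j = best_pos
    matches = length = 0
    while i > 0 and j > 0 and H[i][j] > 0:
        s = match if a[i - 1] == b[j - 1] else mismatch
        if H[i][j] == H[i - 1][j - 1] + s:
            matches += a[i - 1] == b[j - 1]
            i, j = i - 1, j - 1
        elif H[i][j] == H[i - 1][j] + gap:
            i -= 1
        else:
            j -= 1
        length += 1

    identity = 100.0 * matches / length if length else 0.0
    return best, identity


if __name__ == "__main__":
    # Hypothetical toy peptides, not sequences from the tables.
    print(smith_waterman("ACDEF", "ACXEF"))
```

Because the alignment is local, flanking residues that do not match (for example the leading and trailing residues in "XXACDEFYY" versus "ACDEF") do not lower the score or the percent identity.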



TABLE 3 



SEQ ID
NO: of
nucleotide
sequence


SEQ ID
NO: of
peptide
sequence


Method


SEQ
ID NO:
in
USSN
09/496,914


Predicted
beginning
nucleotide
location
corresponding
to first
amino acid
residue of
peptide
sequence


Predicted end
nucleotide
location
corresponding
to last amino
acid residue
of peptide
sequence


Amino acid sequence (A=Alanine, C=Cysteine,
D=Aspartic Acid, E=Glutamic Acid,
F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine,
M=Methionine, N=Asparagine, P=Proline,
Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan,
Y=Tyrosine, X=Unknown, *=Stop codon,
/=possible nucleotide deletion, \=possible
nucleotide insertion)


1


1351 


A 


2 


337 


1 


TPSLIHQAPTPCPAGLWG/PPNGHYHGS*PGC 
H WPQAPHRA* * *GLLPPR WLGHGLPGGPAAP 
WAASQWVDGVAGRLPGPAWSWHASGAAPA 
QPGPL*LLVPGSSGLPDPRDP 


2 


1352 


A 


27 


100 


366 


iRNSSrRPMKJERJETKLSAKHMITCSASYDlRGL 
QIETT\YHHTPIRMAKIQKT/GHHQC**ECGAT 
GTLIHG W WG CK WEPLGKTVWQ1PK 


3 


1353 


A 


40 


3 


314 


. HASAHASVVLKDNSELEQQLGATGAYRARA 
LELEAEVAEMRQMLQLEHPFVNGADKLRPD 
SMYVHLNEL*QSLVENMI_,LTVVDTH\RTPI*R 
SCNYTLALILFL 


4 


1354 


A 


74 


2 


292 


TASALFSCPDGGSLAGFAGRRASFHLECLKR 
QKDRGGDISQKTVLPLHLVHHQVAHTFGQAT 
VTCQQARQSPG * RTNPE/ALQW VLP VSDG WH 
VLPLP 


5 


1355 


A 


78 


114 


850 


ENCRVASNLPGVFFSEDTAQSGSYMRISAHPP 

NAGGEVSNGPKRKLTLMLNFSLPSSGLNAGA 

FYALSTLLNRMV1WHYPGEEVNAGR1GLTIVI 

AGMLGAVISGIV\^DRSKTYKJETTLVVYIMDT 

GGAWWCYTFYLGTGDTCG* CFITAGYTMGFF 

MTGYLPLGFEFAVELNSYPESEGISSGLLNISA 

QVFGIIFTISQGQIIDNYGTKPGNIFLCVFLTLG 

AALTAF1KADLRRQKANKETLEN 


6 


1356 


A 


81 


97 


376 


EWFSYMLGSNMSVYHSP*SLEPLCKVLSES*A 
YLRVPFIRILLNAR * IRKA YKRMS LE JKLLI/RE 
*CLFQEMGLSLQWLYSARGDFFRATSRL 


7 


1357 


A 


93 


2 


872 


TLSSACLIGDAWKELTIVAGAVSNQJLLVWYP 
ATALADNKPVAPDRR1SGHVGI1FSMSYLESK 
GLLATASEDRSVR1WKGGDLRVPGGRVQNIG 
HCFGHSARVWQVKLLENYLISAGEDCVCLV














WSHEGEILQAPRGHQGRGIRAIAAHERQAWV 

I x OOi^jL/O Olivia, Wrlij V v_T IVvJ i jVvJLO/ LyL^Kj OL^i->\^ 

VP**ARYTQGCDSGWLLATAGSD*YRGPVSL 

RIVCYGQWGRSCQGCPHQHSNCCCGPDPVS 
WEGAQLELGPAWL 


8 


1358 


A 


106 


3 


350 


FSSLLSGR1STLRDETGAILIDGDPAACAPIIKF 
LLTEELHLRGVSIYVLRHEAQIYGITPLWCAL 
LI/CRRL* SDSCMRAALKDRGL YQVLILDGLV 
QCLGFVDSDSRKMVSTLT 


9 


1359 


A 


115 


49 


186 


QAWAIFKGKYKEGDTGGPAVWKTRLRCALN 
KSSEFNEGPERERMDV 


10 


1360 


A 


123 


2 


1249 


KGCRTQEKVDRTEVIRTCINPVYSKLFTVDFY 

FEEVQRLRFEVHDISSNHNGLKEADFLGGME 

CTLGQIVSQRKLSKSLLKHGNTAGKSSITVIA 

EELSGNDDYVELAFNARKLDDKDFFSKSDPF 

LEIFRMNDDATQQLVHRTEVVMNNLSPAWK 

SrTCVSVNSLCSGDPDRRLKCIVWDWDSNGK 

HDFIGEFTSTFKEMRGAMEGKQVQWECINPK 

YKAJKKKNYKNSGTVILNLCKIHKMHSFLDYI 

MGGCQ1QFTVAIDFTASNGDPRNSCSLHYIHP 

YQPNEYLKALVAVGEICQDYDSDKMFPAFGF 

GARIPPEYTDSHDFAINFNEDNPECAGIQGVV 

EAYQSCrAPKAPTFTGPTNICPHSSRKVAKFRR 

cp/TirATr/vn <t t"*a Titbit %/rtD/T/^i/( r, l/vC<*»rMkjf/^l> 

SEGN* HQG RAFAI IF! L VDr (JQ VO V YoQJJMGr 
DNPGGHFV 


11 


1361 


A 


147 


614 


9 


ACARKQLLGRTVFIWFVGQLLGGELKGYSKT 

NTTSSRPASSRG\TLSSSSSSSSSLTKDALPSSL 

KSDSTTITSGLVFPFRSLCVNPAKSSVSESVSSI 

KILLS SS VK Y LE * K_RTS> CCrrD a bB bKL SQL 5>5> 

DERVSMGTSSRKPTNSSSSLGALKMSATS\*G 

SGSESPTPFFLTGLQSPPSTRPREPGLTTARNS 

TTLTRDC 


12 


1362 


A 


177 


12 


416 


L1PSEPALDSLVDPRVRSRKQPFVIYPVYDTAI 
DTKJHr SLLDGN VGEr L/Mb Aur C r NHKAAM 
VLFLDRVYGIEVQDFLLHLLEGGFLPDLRAA 
ASLDT/A£IGAMDFLLS*LFTLCLMMFFFIYPFI 
NLLTMNVY 


13 


1363 


A 


249 


535 


105 


W I r HRH LoP AxL [ V L DQO l\s V V a Y Y rvJN 1 V 

MPDTQMEQGLN/HLFLDGNA*PHSVECYCPS 

TFEIAIKJTSFVLYFHRYRAPEVLLRSSVYSSPI 

DVWAVGSIMAELYMLRPLFPGTSEVDEIFKIC 

r\\/\ nTPVVvcn vpvt i 
V<j V l,Vj l riSJS. VolLV r rvL.lv 


14 


1364 


A 


254 


572 


201 


YLLTXIGNLMMLLVINADSCLRTXM*FFLGH 
FFFLDIC YSS VTAQDAAEFPVS *KPIL VWG YIT 
*SFFFIFSWGTNGCLLSAITYACYAAICHPLLS 

IMVlVllNKrLLl AJ ViN/\l INIvlVlVjrL/lNOV^ VrN 


15 


1365 


A 


257 


425 


68 


THAKFLNKKFN I PKL V ILPKL V Yl V KAI PTKM 
Alh,rLLbL.UQNI 1 US-LlL-bN 1^ JvNlAKfNl^JNJCKV 
TFTPIET*HPVKQMIKWQ*LTAWLRNRGYKKI 
KOTPNSFTAPSVORNLVFDK.CG 


16 


1366 


A 


263 


104 


481 


FCIFRTTEEDRGGDDCWSVWTKQRNNSCVK 
SKDVFSKPVNIFWALEESVLGVKARQPKPFFA 
AGNTFEMTCKVSSKNIKSPRYSVLIMAEKPV 
GDLSSPNETKYII SLDQDS WKLENWTDASRV 


17 


1367 


A 


298 


68 


208 


RKRTNNPIKLDKKFEHFKNEDI+ITSKHTKMW 
VSSLAMKEMLTKTTM 


18 


1368 


A 


300 


904 


1 


LWGITGTRHHARVIFEFLVETGFPHVGQAGL 
EXLTSGDPPALASQSAGITGMSHCARPKGHFG














IHLK*MFYTMSQKMP* PTINLILLLUPGNLNIF - 

KPNMGWLGPKTAFV*KDEVLSGIPFAKGRCR 

WK*DY*OLQEVTDPIMEKGKKKKRTASFFK 

GQPHQSTNALLRRCVR*RYHLS\TVETAGLP* 

KNTGHIPGQPFLFKLWKC*KVICI**QYKW*Q 

MGVKNKSFCPH*SSSPSL*FIGHHSRNF/CSFK 

TEPHSVVQAGGQWRKLSSLQAPPPGLMPLSR 

1SLMSSWDYRRPPQ 


19 


1369 


A 


302 


3 


445 


NSPSRWAKIQMFEHTFCG* GCG/ER/NVHIHCS 
WICRLRPLLWRAVREYLSKLKNAELSFDPGV 
SLLRIYAIDMPTSI*DEKEALLFAFLAFHE*HC 
KSRIWA VIQ/CIHL WDW LRKL* CFHRMKF YA 
AV*NKPRHLLSHIWKDVQKILLK 


20 


1370 


A 


304 


1 


1339 


FFFCGKEVPLFEQNKHPGPRATTSPGA/HARA 

LLSAGEFTAGVGLSP*AIHSFVWLCTFIQHGA 

GGPCHQPGGSPGPWMHTTQAGHLWEGAYPG 

GSSTWHQVPGQLGGSWGPRERSLLGSFIKCSP 

CPHPPGFRLWMSPKQKPPTEKPGVMGRVWR 

1 A/TPflF^Pl TWFAFfiK"FriWT ^PPfiOfiHSF/PVd 

PLHSSLGKTVKP*PKNQKPKQNRSRHGQ\GF 

MAGQGQSRPAAR*PPCPALTPASHSAGTWPP 

RICRTVPGGPCPSPSGFRSCRR*GFSA*TRSWP 

DAEPPSTPDTAPRCCTQSDTSSQGPQ+S* WRR 

CRALPGRLCSAPAAGLRRARPRLSESRRGNSP 

PASPAAASARCPSWGPSCPARPPSRPAAGTEP 

AAPSRCTAWLRGEREPGPRPPGRRPRSGRGP 

VSFAPEVLSLPAVRQTKSWRWRKEEEITRPW 

AJ VRSRGG 


21 


1371 


A 


326 


799 


1587 


GSQ VLPPPPSQDSATLPQDA* GPRAAPGQPVC 

E* GLQG AGVRRLRGEVLCQPQP* G AL*EQCLP 

HLSFSPRQGAAPDTEPSAWGPAPTGATGPGLP 

LRHVRLFSAGAPRGAATPCPPALLHGPAWPP 

ARPMFRGHPPVRPLGPWGKVAAGPRALCLA 

G VPA VO G EC ATICPS G* G I* PA HLRGPPGPF VI 

QWHWQLSAGRDPVPAEDPPL*EGPLGPGGPA 

AAQAEPGADPEPEDKDQAAESRPAGAMSLSA 

QGSGPVGGQGLR 


22 


1372 


A 


327 


146 


652 


PHLENPHPEHSFPGAPLT*STLSWSILSPREPSP 

GAPCYPGHPHLENPHLEHLLTWRTVTWSTI L 

PGAPCYPEHPHLEHPLTWSTPHLEHPSPGEPL 

SCRTPTRSILHRDHPLP*CLSTEESPI*GWGSLP 

APPSTPLVLDVAPPGPQPASSCPGRDSCYSVP 

GTVVSP 


23 


1373 


A 


348 


397 


2 


CIVSSCQGTRKPCHLEDAKKINKQSPTLEK1ES 
LQESL* VKQ* LI VAEK YVQILHPRKKYFQRPL 
N^KRKMKKRKEEKKKCRERMQRRSKWRR 
EEKKE*RREE\EERKKEKEDRKERRKETSPRG 
SRRLLRD 


24 


1374 


A 


362 


170 


352 


GRALDTAAGSPVQTAHGLPSDALAPLDDSMP 
WEGRTTAQWSLHRKRHLARTLLVSRVRGPQ 


25 


1375 


A 


384 


373 


128 


YLITTILETGYLWKNRHSDQ*KRTENPERDQH 
KYPKVDFCKSKSMKNRLCNKWHWTNWIFTD 
KKINLNLKPHTKLTPNIKKN 


26 


1376 


A 


397 


383 


165 


EVKKTOPFIFSG*rNLTIWIRSI*RKSDEJKQRTK 
♦MEKYSISIJ3RRLNTVKMSFLPNLI YKFNT1 SI 
KIPANF 


27 


1377 


A 


406 


103 


380 


KSKATGYMVN1*KLIVNFLYAKDEQLE1EMNK 

IV^^r^GSKNKIAF^^^-TKYQNIQ^^<HAE 

LVNK1EDLNKWRKVLLSWIGRRK1INTMT


28 


1378 


A 


408 


14 


427 


TICTNKi^fNLDEIK/FLERHKLSKLTQEEVENL 
3TLKTSRETEL VINK* VJPHKEKPGPDSFTGEF 
YQTFKEEL/II/ILHKLFQTIKYGR1LPNSVYETSI 
TLKPKPEKDL\KENYRPLPLSN1DAK\LNKTLA 
NRI**HIR 


29 


1379 


A 


434 


395 


128 


1YSKMCMERQRLNN* ILKKNKVRGIAVPDVK 
VYYKPTVIK/TSWIL*KDSHIVEWNRLENLEID 
PN/IKRLILDKGAEATEWRKDSFFRQWQ 


30 


1380 


A 


455 


2 


228 


FFFETESHSVTQAGVQWCNPGFKRFSCFGLSS 
S WDYR YAPPRP\ANF\* FLVETGFYYVAQAGL 
KLLSPGDLPALAS 


31 


1381 


A 


462 


393 


2 


QLMFDKGVKNIH\WG\VTPPFTX*YWKNWISI 
CRRMNLNPYLSRYIKJNSR\KDLTVRPEPIKLV 
EENTGKTIQDTGLGK*FIAKTSKAQSTKTNK* 
KRQTRYIKLKXKKSTASKENNRVKRQPLE* EK 
IF AN 


32 


1382 


A 


474 


125 


471 


VKPYEIAVFLVKPIEYK*HLLSDPAIPLSG1*LK 
EIKAYT/RRJCTPMFAAPVSVIA/RN*KQSK/CQ 
KQ * YVHRME Y YTTI KRSEDLI C Fi " 1 WVDFRNT 
ILRETDRIHKTTYDVISLI 


33 


1383 


A 


488 


1825 


2 


KSACSFICSEEQPASPSPLKPGTYASETARPRDP 

HAAGPRRDSSEAETRRPRGA/DGSGTVVKGT 

PGSPAPPCSWGHGG\ETEGAG*CPAAPGTDLR 

APGGSAGS*\GLPSAGGSRGRKGWRAAGRQP 

STR*GRPGRHGGRGE*AGHPEPRQSALQSAG 

L/ASSPEPMGAALAEDGSGDSRGAGPRPQE*P 

PSVLSRS\GS*G*G*AASGTASSPRSHSSRLGPP 

SAGFHGLRCGQPPFAAAPPGPWPGTGRPAGG 

AG SPP AAAGTAPPATRGAQSRRQNRTAGRNA 

SPQTAAGAGSPVQWALSRATG*TGETGSWC 

AGGTHQATHLTAAWVCPPTWSVRPGGSGPA 

AGLGR+GRHPAQSPPLPVPRG*PAWPQEAPSP 

SPASSEVALSSGSCWPDQAPGPARGSPPAPLA 

PA WPAAGRGRQR* GRQSAHPPPRR* STAVSL 

SGTS*WRRSP*AGTRTQQC*SPWLVPACSSRP 

L*RGTRRPSTQQSPQTTGTTGRSAGPGHPRS* 

GGRSPAGTGHLGAQTVASPH*GHWPTALSCL 

WASASPPGPEAPPQTGACIGTNCRYRAASAR 

RSSVAPACA*GWQ*AGSPPAVLRGPP*RVRER 

GALTHRPRAPDE 


34 


1384 


A 


497 


422 


2 


APGASVGRAQAAEG*RGGPTGRPPSALGVS/E 
AGRAGRAGEGRPVPPAYPLCKSAQTSGPPKA 
RLSVPPLASCGGRGPPGGAACATCAPPAGPAR 
SSRCRRRSPPE*GPR*PSRPARPSPGSAASRRQ 
KLTPCRCQFRG LCA 


35 


1385 


A 


509 


156 


475 


PTP YPGE* QAAFLLRGPGLRPPA/DPSLR/HRN 
LTELWAVTDENIVGLFAALLAERRVLLTAS 
KLSTLTSCDHAFCALLYPMRWEHVLIPTLPPH 
LLDYC*CPPLPRT 


36 


1386 


A 


512 


3 


1631 


FFFSFVCHLYCVSPTPGPHGRLATWL/PGLLA 

FLGLAAGGQTLCPAGELPGHARAQASGAPGS 

VLIAVPGRRRVHTCGPGPAAPSTRGECPPPAL 

GHTRPARPRPV\PFAPAVPQEPGGQGHGAA/P 

PATGHSAPRGCPPARAAPTGSATPAPPPAACA 

AFHSAWSVPPAGRQQG*RVPAPAFRRTTPGT 

PGQHL LDRPG APP AQG SGPAPAPPPRL AG PA 

GPAAPPPGPPAASWHSSLSKSSSSUGWSPPLP 

VGPGSLQ*TPPPQGPHLSGSCGGTSSWRGQR 

AAVARRLRSWNACGLSRVAGRSSASYPGRE 














GRPSQSQ*PAGPPGMRGCCLRGW*PSSSGSD

GPGPHPASTWLRAGKTGPSPPACGCA*LPPPS

VSAAPQSPRTRCPRGCAAAAGLCVLAAAGAS

HGA\GLPGVRVHTQRVHIH*GAG/GCQTPRPR

LRSLPVLGLPAPRCPVSAHPWHRRSGSSCHA

ARLVPRHPAPGCP**TG*\PLITGFPEP*A*GLP 
NHQAVGLEASGALQAGHRDELPTMVQLLDH 
SPDYPLKGRPHAP 


37 


1387 


A 


620 


828 


1 


FRLPLAAGA/RGAAFPRVAVSMAPnP<!Al»frH 

WEASPEMQSKCHQKGKNNQTECFNHVRFLQ 

RLNSTHLYACGTHAFQPLCAAIDAEAFTLPTS 

FEEGKEKCPYDPARGFTGLIIDGGLYTATRYE 

FRSIPDIRRSRHPHSLRTEETPMHWLNG*EDE 

AODDGG*GTISSFLI PWPADHPTPK<;PGFPVH 

SIPVCCQVRGQPQSGGKESPACLKSLSNCLTH 

VDAEFVFSVLVRESKASAVGDDDKVYYFFTE 

RATEKESGSFTQSRSSHRVARGIPPL 


38 


1388 


A 


739 


1 


427 


FRAMVSSTLKLGISILNGGNAEVQ/QGNRGKG 
TSEEGKEG*EVPV*LPVSPPLPRPLQKMLDYL 
KDKKE VG FFQSIQALMQTC\GEKVMADDEFT 
QDLFRFLQLLCEGHNNDFQNYLRTQTGNTTT 
INIIICTVDYLLRLQESI 


39 


1389 


A 


767 


1 


1030 


TLDLTGPLLLGGVPNVPKDFRGRNRQFGGCM 

RNLSVDGKNVDMAGFIANNGTREGCAARRN 

FCDGIIRRQNGGTCVNRWNMYLCECPLRFGG 

ICMPFOriFWPA ^<5[PPVTA AWPAI I T nA/PPTT 
rv.iN^nA./vj.c w rAoolrr V 1 /\J\V\ tlf\L,L>L>L/ V ru 1 1 

VRGLHIQVRQPLVVYAAFTVDSHRPLQETVL 
RRAPAPASGVPSPSGVGWDR+AGPAEPSPSTP 
ATVIISVPWYLGLMFRTRVKEDSVLMEATSGG 

PTSFRLOVTGAprHOfiTr*vnARriRnp?v4T <zn 

LRVTDGE WHHLLIELKNVKEDSEMKHLVTM 
TLDYGMDOVSWHLHLLWG*TLPPAOGKTGA 

SEDKVSVRRGFRGCMQVRGGCGGRGEACPS 
QAAPRL 


40 


1390 


A 


801 


69 


399 


IHKIHHKJEDLNKWKYILCSGMERLSTVMIPVV 
PQHYKPNA*Q\VILKFTW*E*GAKITILJvKNKL 
RGL VLVPLSTC* VKYLLDKVLPHKTY YE AR 
VNKSVVLVQVTIM 


41 


1391 


A 


835 


7 


195 


SMLKERKVFQFPSCLFFQYITWLGPPYHVLFD 
SS V1T4FS IGAK* DILOS VMNCLY AKRIPCVT 


42 


1392 


A 


841 




415 


GSTHA SG YDKTPDFILQVPVAVEGHIIHWIES 
KASFGDECSHHAYLHDQFWSYWNSLKHRTW 
QGIGTVASNL SQL*TLN APFPELLLFRSLARTG 
FVLT*\RFGPGLVIYWYGFIQELDCNRERGILL 
KACFPTNIVTL 


43 


1393 


A 


845 


358 


92 


PALSPAPVPQKKGSPLPLDPCLGPSSWLLSVG 

LGWPRL*PRRGPGDPGSLPATPPLLTPPHTLLP 

ORPMLPPSHAGLARPPPPEPISVP 


44 


1394 


A 


853 


452 


1 


LPQYCFFPRLSPKSKLVKHSAL* *PSALKPPTK 

SPRCIPRTSLYFTICC/PPALQL/SPIEDPPAIYRS 

PPTHMLRSASQPLNQAPTLVKGHPPSRFLQG 

QVSCPPQPTLPREKPLPLHLRPPPRPAQPPLPR 

PLTFSTRRNVDPEIPERFR 


45 


1395 


A 


894 


379 


162 


GVYPPTVFDNYSVQTSVDGQIVSLNTWDTAG 
QEEYD/RLRTLS*PQTSIFVICFSIGNLEFPIYGT 
WLSMSMGK 


46 


1396 


A 


900 


1 


366 


TTKXTLISNNVSSRSLP1LPELKAFSLAFNDPL 
EIQK YMRT/DQ* CVTHDISLYI VTKLALIFLIPR 
VFLFHQLNIT* * CLHFFTMTTFI AIPFSFLFLGR 
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SEQ ID 
NO: of nucleotide sequence



SEQ ID NO: of peptide sequence



Method



47 



48 



49 



50 



51 



SEQ ID NO: in USSN 09/496,914



1397 



1398 



1399 



1400 



1401 



944 



963 



967 



Predicted beginning nucleotide location corresponding to first amino acid residue of peptide sequence



162 



216 



466 



973 



45 



992 



2095 



Predicted end nucleotide location corresponding to last amino acid residue of peptide sequence



308 



421 



194 



52 



53 



54 



55 



56 



1402 



1403 



1404 



1405 



1406 



994 



1011 



1016 



1033 



462 



630 



222 



1044 



366 



429 



Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid, E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine, I=Isoleucine, K=Lysine, L=Leucine, M=Methionine, N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine, T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop codon, /=possible nucleotide deletion, \=possible nucleotide insertion)



D/KSLAMLPRLVSNSWPQVILPP 



QLQNLASRGCL* SQLLRRLRRENRLN PUG GG 
CSEIAPVCTPAWVTQRDFFRKKK 



HFTPDR1AIVKNTRDSHC WRGC* EEG AP ARC 

PRKRESWWGERLP/PRGFPPAAEDAPAPGWK 

GRKHASRTARAHVFHPIRQSIRSPVRGRPGDP 

RAAHTRSAGTRLQCKASRGG*GKGPAPTR*E 

GGPGSAPAPLPASSGCSLFPDSSPWTPPPPAPG 

AAAAQP* *TPRCPAALRAGAHIGRVGRP Y 

EKCIQALDVFVFCYIDHSSHCLMSCD^E/UQA 

LNFMPLEMEPKMSKLAFGCQRSSTSDDDSGC 

ALEEYAWVPPGLRPEQIQLYFACLPEEKVPY 

VNSPGEKHRIKQLLYQLPPHDNEVRY CQSLSE 



IRIRHEAARSCLGCAAGHVPAPGLRLLPTVRG 

PPGRRGPAAPGCVCY* SGESTF VSH VPQRMA 

WPGSAPPRGFHPLQSQTSPSDTVSSPQLSKEE 

DGPGWEHPLSSSL*SLGQAGGNH+QPEELAG 

WEPRGPPSLAPSSPT/TMWTALVLIWIFSLSLS 

ESHAASNDPRNWPNKMWKGLVKRNASVET 

VDNKTSEDVTMAAASPVTLTKGTSAAHLNS 

MEVTTEDTSRTDVSEPATSGVAADGVTSIAPT 

AV AS STTAASITTAAS SMTVASS APTT AAS ST 

TVASIAPTTAASSMTAASSTPMTLALPAPTST 

STGRTPSTTATGHPSLSTALAQVPKSSALPRT 

ATLATLATRAQTVATTANTSSPMSTRPSPSKH 

MPSDTAASPVPPMRPQAQGPISQVSVDQPVV 

NTI~N KSTPMPSNTTP EP APTPTV VTTTKAQ AR 

EPTASPVPVPHTSPIPEMEAMSPTTQPSPMPYT 

QRAAGPGTSQAPEQVETEATPGTOSTGPTPRS 

S GGTKMP ATDSCQPSTQGQYMV/DHH * APHP 

GRGRQNSPSGGAVTRGDPFHHSLGFVCPAGL 

* ELQEEGLHPGGLLNQRDVCGLRNVRG AGA 

WREAWPLPRPFLLPLRPNQVLPNSFGA1EEIC 

QMLKHI 



ESGEFLVSFTLKKPTNVFHHINGMKFFNK/LIF
* SHTDIAFYKIQHPFMLKALTKW A* EGT*PDR
RYLH* SLRLKGEQLKTFPLRSGMR*G/CAILPL
VLNAMLSIVPAVVPAGKTRHEKEITCPLIGQE

EK*FS*FVGDMNTCVENKKESKKLLE 



PEVIQQSAYDSKADIWSLGITAIELAKGEPPNS 

DMIlPMRVLFLIPKNNPPniCWRRLLESFKEV 

* LMLA* TKDPSI\RPTAKELLKHKFI VKN SKKT 

SYLTELIDRFKJIWKAEGHSDDESDSEGSDSES 

TSRENNTHPEWSFTTVRKKPDPKKVQNGAEQ 

DLVQTLSCLSMIITPAFAELKQQDENNASRNQ 

AIEELEKSIAVAEAAGPG 



ISIDA*KAFDKIQH/CFMITTLK1CLGIDGKYLN 
Tl KAIDDRHTV STILN VEKLKAFL * RSGTRQRF 
PISGSGARI 



HASVDGDEGSDDVYYYYTPAILRELQALNTA 

EAAEHRPEEDRMLSEDPWRPAHMIKG YMPL 

HNIPHTEVIDVTGLNQSHLYQHLNKGTPMKT 

QKRAAVLYTWHVLEQLEILRQINQQSHGPG 

SVLTLQTRSPSKPLSXRKLMDWEVVSRNSISE 

DRLETQSRASRSPPVTPNQSQETPVDGKPLAL 

PPNQSQKNIRYHIHYLHLQYYLDRH1SATLPIP 

SSSGIPTPIAVITDALTDLVELILGQPCSEESGR 

APGTLFLLAL 


57 


1407 


A 


1050 


11 


430 


GAYAFETNGFPIMLVLTTDKIEGDVGIAGLYD 
MHVISLPMAFLLRTLVRCTSYIIPVTHVLSTPV 
TCLRRREKDGVIVDVLSDTASNHNGFPVEEH 
AUDI HPARLQGPTLRSQPMGPLKHKAFEERA 
NLGLVQRRLRLED 


58 


1408 


A 


1058 


258 


419 


LKHRDTPVVGANNRALSCTPLTSLTLCALCPL
FLLGCPTXATCRLYQTTVAVVF 


59 


1409 


A 


1064 


3 


425 


KAFSFTTSLIGHQRMHTGERPYKCKECGKTF

KGSSSLNNHQRIHTGEKPYKCNECGRAFSQC 

SSLIQHHRIHTGEKPYECTQCGKAFTSISRLSR 

HHRIHTGEKPFHCNECGKVFSYHSALIIHQRIH 

TGEKPYACKDVGK 


60 


1410 


A 


1065 


204 


419 


GGPPGPFLAHTHAGLQAPGPLLAPAGDEGDL 
LLLAVQQSCLADHLLTAS WG GK/DPIPTKALG 
EGQEGLPLTV 


61 


1411 


A 


1079 


3 


383 


RHSRAHLCQPFHLVMRDLLQLGQDIPQGCHY 

LEENHLIHRDIAARNCLLSCAAPTRAATIGDF 

GMARYIYRTRYYQLGDRAL/LPRKWMPPEAL 

LEGIFTYNTDSWTFGVLLWEIFSLGYMPYPGR 

TN 


62 


1412 


A 


1080 


1 


859 


VVEFLWSRRPSGSSDPRPRRPASKCQMMEER 

ANLMHMMKLSIKVLLQSALSLGRSLDADHA 

PLQQFFVVMEHCLKHGLKVKKSFIGQNKSFF 

GPLELVEKLCPEASDIATSVRNLPELKTAVGR 

GRAWLYLALMQKKLADYLKVLIDNKHLLSE 

FYEPEALMMEEEGMVIVGLLVGLNVLDANU 

CLKGEDLDSQVGVIDFSLYLKDVQDLDGGKE 

HERITDVLDQKNYVEELNRHLSCTVGDLQTK 

IDGLEKTNSKLQERVSAATDRICSLQEEQQQL 

REQNELIR 


63 


1413 


A 


1083 


2 


615 


SSFAKHKJIIHTGEKPFICLECGKAFTSSTTLTK 

HRRfHTGEKPYTCEECGKAFRQSAILYVHRRI 

HTGEKPYTCGECGKTFRQSANLYAHKKIHTG 

EKPYTCGDCGKTFRQSANLYAHKKIHTGVEKP 

YKCKECGKAFKS YY SILKHKRTHTRGM S YEG 

DEC/QRSLN/RSSILSNHKIIHNEEK/PLKCEKCE 

KAFNHTSICCRHKKN 


64 


1414 


A 


1084 


946 


1 


KKQDL SSSLTDDSKNAQAPLALTESHL ATL A 

SSSQSPEA1KQLLDSGLPSLLVRSLASFCFSHIS 

SSESIAQSIDISQDKLRRHHVPQQCNKMPITAD 

LVAPILRFLTEVGNSHIMKDWLGGSEVNPLW 

TALLFLLCHSGSTSGS\HNLG\AQQDQCKISFS 

FFSWLTTGLTTQQRTAIE\NATVAFF\LQCI\SC 

HPNNQKLMAQVLCELFQTSPQRGNLPTSGNI 

S\GFIR\RLFLQLMLEDEKVTMFLQSPCPLYKG 

RINATSHVIQHP\MYGAGHKFRTLHLPVSTTL 

SDVLDRVSDTPSITAKLISKQKDDKKKK 


65 




A 
A. 


lUo/ 




324 


PRAFEFVI1TEMIVG/RVQNIHLFTLQVLEDRA 

LFTMSVGSSLWSTYLIHVMALP/DRELLKPNA 

SVALHKLSNALV 


66 


1416 


A 


1095 


3 


493 


TJW'rpcVTHIV^lF^l PPT XTPCl-lPAQXDr , tJ'rt7KTT:r* 
rxct i v_-o v i ni v or oLrr JLiNrorir Ao 1 rUxl I liNfciJ 

PSLVWFDRGKFYLTFEGSSRGPSPLTMGAQD 

TLPVAAAK1ETVNAYFKGADPSKCIVKITGE 

MVLSFPAGITRHFANNPSPAALTFRVINFSRLE 

HVLPNPQLLCCDNTQNDANTKVEFWVNMPNL 
MTHLK 


67 


1417 


A 


1098 


57 


356 


LKLTSLGFIIGVSWGNLLISILLVKDKTLHRA 

PYYFLLDLCCSDILRSAICFPFVFNSVKNGST 

WTYGTLTCKVIAFLGVLSCFHTAFMLFCISVT 














RYL 


68 


1418 


A 


1106 


1 



1326 


MGKISATGINMGTKCSWALVWHLESYDPKH 

YEREGMQDWKTASGQSEEATQQSSQKPQPH 

YTTYQSSSFLKYSSESHLLAWRENSSEGSFQF 

PGRSRARPPRTRQQRRGAAAGPGRGAVRLG 

HPQS AAQPQLRAAARIPESP AAFPAQPRPG S A 

RNSDASGPASLSRTLGRASSPRPPQAPDVTAP 

SPAALAPRAARGGSRAAALAGAEAEEPLRTL 

APRPTRAAAPPPPPPPPPLPPGAPPPPVRCVSR 

RARAPPWR/PAATGPPPVRPVAPSRKLGSARAP 

APALQIRKGTS SGLPGRGGG SGPGNNLS SVA 

GNWRGSSFAVERPGMAKYQGEVQSLKLDDD 

SVIEGVSDQVLVAVVVSFALIATLVYALFRNV 

HQNIHPENQELVRVLREQLQTEQDAPAATRQ 

QFYTDMYCPICLHQASFPVETNCXjHLFCGSLT 

PNSIW 


69 


1419 


A 


1107 


2 


466 


FDTARLHEFGTSITQIFAVDNREDLQKWMEA 

FWQHFFDLSQWKHCCEE1AIKIEIMSPRKPPLF 

LTKEATSVYHDMSIDSPMKLESLTDIIQKKIEE 

TNGQFLIGQREESLP/SS/CGPHSLMVTIKWSS 

RKRY/SYPASEPLHDEKGKKRQAPLPPSDK 


70 


1420 


A 


1111 


698 


23 


ALRRLHYVKATKWLSFRRPFWREEHIEGGH 

SNTDRPSRMFYPPPREGALLLASYTWSDAAA 

AFAGLSREEALRLALDDVAALHGPVVRQLW 

DGTGWKRWAEDQHSQGGFWQPPALWQT 

EKDDWTVPYGRIYFAGEHTAYPHGWVETAV 

KSALRAAIKINSRKGPASDTASPEGHASDMEG 

QGHVHGVASSPSHDLAKEEGSHPPVQGQLSL 

QNTTHTRTSH 


71 


1421 


A 


1119 


2 


385 


QKQTLQNGYLDSSMDILYLGSLPPELQVSSDE 
PPGPPEQAGLSQFI1LEPETQNPETTEEIQSS\LQ 
QEAAAQLPQLPEWELSSTKA\EAPALPSQSL 
EG VHS STEQKAPAQQLPAFEEIL APLLIHHE 


72 


1422 


A 


1127 


1 


906 


HAQYVGPYRLEKTLGKGQTGLVKLGVHCIT 

GQKVAIKIVNREKLSESVLMKVER£IAIL\RLI 

EHPHVLKLHG VYENKK YFPPDELTS GPSMLA 

QVSPHGKLSARRSWDLLSGFPRYLVLEHVSG 

GELFDYLVKKGRLTPKEARKFFRQIVSALDFC 

HSYSICHRDLKPENLLLDEKNN^UADFG^4AS 

LQVGDSLLETSCGSPHYACPEVIKGEKYDGR 

RADMWSCGVILFALLVGALPFDDDNLRQLLE 

KVKRGVFHMPI IFIPPDCQSLLRGMIEVEPEKR 

LSLEQIQKHPWYLGGNFIS 


73 


1423 


A 


1128 


1 


802 


LRNALDVLHREVPRVLVNLVDFLNPTIMRQV 

FLGNPDKCP VQQA/MLEPLG SKTETLDLRAE 

MPITCPTQNEPFLRTPRNSNYTYPIKPAIENWG 

SDFLCTEWKASNSVPTSVHQLRPADIKVVAA 

LGDSLTTAVGARPNNSSDLPTSWRGLSWSIG 

GDGNLETHTTLPNILKKFNP YLLGF STSTWEG 

TAGLNVAAEGARARDMPAQAWDLVERMKN 

SPDINLEKDWKLVTLFIGGNDLCHYCENPEA 

HLATEYVQHIQQALDILSE 


74 


1424 


A 


1139 


60 


480 


FREPCLLVPGDHQPLREASWLA/LPPIGLWGT 

DSPLCCVEVAIPCNKGAHSVGLKGWLLAQG 

VLGMRDTIPQEHPWESTPDLCFCRDPEEIEVE 

EQPAADAAVAKGEF/QGEQIAPVPAMIAAHPE 

AADPAPVHTTAHPKGA 


75 


1425 


A 


1147 


2 


413 


PFPHQHPQEPNKGSCWPQSALRGQCPGPVLGV 
TTTSDLCSLQVPVSSHRNPLLDLAAYDQEGR 














RFDNFSSLS1QWESTRPVLASIEPELPMQLVSQ

DDESGQKKLHGLQAILVHEASGTTAITATAT 

GYQESHLSSAR 


76 


1426 


A 


1155 


38 


410 


PIISAPAQDDPILLSFIHCLHANLLCVWRRDVK™ 

PDCKEIWIFWWGDEPNLVWQYIMNCMT.WK 

KJ3SGKMAFPMNVGRC/FFKEIHNLLERCLMD 

KNFVL1GKWFVRPYYKDEKPVNKSEHLSCAF 
T 


77 


1427 


A 


1162 


526 


350 


RFPQGLEDVSTYPVLIEELLSRGWSEEELQGV 
LRGNLLRVFRQVEK VQEENKWQ SPLED 


78 


1428 


A 


1171 


1


1293 


MAESASPPSS SAAAPAAEPGVTTEQPGPRSPP 

SSPPGLEEPLDGADPHVPHPDLAPIAFFCLRQT 

TSPRNWClfCMVCNPWFECVSMLVILLNCVTL 

GMYQPCDDMDCLSDRCKILQVFDDFIFIFFA 

MEMVLKMVALGIFGKKCYLGDTWNRLDFFI 

VMAGMVEYSLDLONINI SAIRTVRVT RPT KA 

1NRVPSMRILVNLLLDTLPMLGNVLLLCFFVF 

FIFGIIGVQLWAGLLRNRCFLEENFTIQGDVAL 

PP\YYQPEEDDEMPFICSLSGDNGIMGCHEIPP 

LKEQGRECCLSKDDVYDFGAERQDLNASGL 

CWWNRYYNVCRTGSANPHKOATNFDNTfiY 

AWIVIFQVITLEGWVEIMYYVMDAHSFYNFI 
YF1LLIIVSVREPGLLGGSFSTAOSPKCOGDSFP 
G VAAESLLLRG W VLWLPGGG 


79 


1429 


A 


1175 


1 


405 


PNDFFKDMFPDLPGGPLGP1KAENDYGAYLN 

FLSATHLGGLFPPWPLVEERBCLKPKASQQCP[ 

CHKVIMGAGKLPRHMRTHTGEKPYMCTICE 

VRFTRQDKLKIHMRKHTGERPYLCIHCNAKF 

VHNYDLKNHMR 


80 


1430 


A 


1182 


25 


198 


EMNELSQQLSQQGGRGASQCPSPPAPTLPNPT 
PLCQLQLQRVNTGLPTPPCHPGAGAA 


81 


1431 


A 


1186 


254 


583 


KTVLDVGAGTGILSl FCAQAGARR VYA VEAS 
AIWQQAREVVRFNGLEDRVHVLPGPVETVEL 
PEQVDAIVSEWMGYGLLHESMLSSVLHARTK 
VVKDGGFFLPXSSELFM 


82 


1432 


A 


1187 


2 


716 


DFVDAARNLPLESTKSPAEPSKSVPSLENDPRA 

SSQGLPSQGPVQNQGRRGEQRPKKF/TVIQHT 

SSFEKSDSLEQPSGLEGEDKPLAQFPSPPPAPH 

GRSAHSLQPKLVRQPNIOVPEILVTEEPDRPD 

TEPEPPPKEPEKTEEFQWPQGSQTLAQFPVEK 

LPPKKKRLGLAKMAQSSGES SFESS VPLFRSP 

SQESNVSLSGSSRSALFERDDHGKAEAPSPSF 

DMGPKPLGTHMLTV 


83 


1433 


A 


1188 


517 


804 


ESPGLSKVLRTGAFAYPFLFDNLPLFYRLGLC 
WGRGHGCGQEALSTSHGYHLFCALLTGFLFA 
SHLPERLAPGRFDYIGHSHQLFHICAVLGTHF 
Q 


84 


1434 


A 


1192 


45 


476 


LGDVGFWVERTPVHEAAORGESLOLOOI rFS 

GACVNQVTVDSITPLHAASLQGQARCVQLLL 

AAGAQVDARN1DGSTPLCECLRLGQHRVCEA 

LAVLRGQGQPSPVHSVPPARGLHXREFRMC* 

GFLFDVGXNLEAHEFHFGEP 


85 


1435 


A 


1194 


69 



410 


KRSEEASAPPFPLGGTGAAPTRASLPEQILLPR 
SCLEARKSQPDEKLLSALHNSRTWN*EPRRSQ 
HRLVSPEVHPGRRGSSPGVAECKLTSAYFRT 
GRSPCPSLPGTTRTOSLL 


86 


1436 


A 


1215 


3 


405 


LPSHTCGNPGRLPNG1QQGSTFNLGDKVRYSC 

NLGFFLEGHAVLTCHAGSENSATWDFPLPSC 

RADDACGGTLRG/AEWHHLQPPLPLG/ATKN 
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NADCTWTILAELGDT1ALVFIDFQLEDGYDFL 
EVTGTEGSSLW 


87 


1437 


A 


1216 


226 


964 


GTARFGPMVGFGANRRAGRLPSLVLGVLLV 

VIVVLAFNYWSISSRHVLLQEEVAELQGQVQ 

RTE VARGRLEKRNSDLFA WG HA QETDRPEG 

GRLRPPQQPAAGQRGPREEMXEDDKVKLQNN 

ISYQMADIHHLKEQLAELRQEFLRQEDQLQD 

YRKNNTYLVKRJLEYESFQCGQQMKELRAQH 

EENIKKLADQFLEEQKQETQKIQSNDGKELDI 

NNQVVPKNIPKVAENVADKNEEPSSNHIPHG 


88 


1438 


A 


1218 


1 


534 


PEFGTTISCGYLMATDVSRRPSVHKAVEIEQE 

RVKSAGAW1IHPYSDFRFYWDLIMLLLMVGN 

LIVLPVGITFFKEENSPNPWIVFNVLSDTFFLLD 

LVLNFRTGIVVEEGAEILLAPRAIRTRYLRTW 

FLVDLI SSIPVDYIFL VVELEPRLD AEVYKTAR 

ALRIVRFTKILSLLRL 


89 


1439 


A 


1223 


1 


743 


MGFDEVFMINLRRRQDRRERMLRALQAQEIE 

CRLVEAVDGKVGMLTRSNAAPGRHLAMLET 

LVVVAPRFVDADNLILNPDTLSLLIAENKTVV 

APMLDSRAAYSNFWCGMTSQGYYKRTPAYI 

P1RKRDRRGCFAVPMVHSTFLIDLRKAASRNL 

VAFYPPHPDYTWSFDDIIVFAFSCKQXAEVQMY 

VCNKEEYGFLPVPLRAHSTLQDEAESFMHVQ 

LEVMVPSSPSSAQSMAVVSADH1GLVISYL 


90 


1440 


A 


1227 


2 


349 


NKTSFIFYLKNIWADLIMTLTFPFRIVHDAGF 
GPWDFKFILCRYT SVLFYANMDTSI WL GLIT/ 
YDRY/WKVVRHL/WDSWMTGI/SFTRVYLLG 
LGARLVWFGKLILAKGGHGGISWL 


91 


1441 


A 


1245 


3 


1937 


LGSSDVRAPQRSELGAESPSRMVASQAYNLT 

SALTPILTRSRVLNEEPLTLAGF^SRAPANLSD 

WQLIFLVDSNPFPFG YI SNYTVSTKV ASMAF 

QTQAGAQIPIERLASERAITVKVPNNSDWAAR 

GHRSSANSVWQPQAFVGAVVTLDSSNPAAV 

LHLQLNYTLLDGRYLSEEPEPYLAVYLHSEPR 

PNEHNCSASRR1RPESLQGADHRPYTFFISPGT 

RDPVGSYRLNLSSHFRWSALEVSVGLYTSLC 

QYFSEEDVVWRTEGLLPLEETSPRQAVCLTR 

HLTAFGTSLFVPPSHIRFVFPEPTADVNYIVML 

TCAVCLVTYMVMAAILHKLDQLDASRGRAIP 

FCGQRGRFKYEILVKTGWGRGSGTTAHVGIM 

LYGVDSRSGHRHLDGDRAFHRNSLDIFQIATP 

HSLGSMWKIRVWHDNICGLSPAWFLQHnVRD 

LQTARSTFFLVNDWLSVETEANGGLVEKEVL 

AASKASFRVPTPSVAALLRFRRLLVAELQRGF 

FDK.HIWLSIWDRPPRSCFTR1QRATCCVLLICL 

FLGANAVWYGAVGDSAYSTGRVSRLNPLSV 

DTVAVGLVSSVWYPVYLAILFLFRMSRSICV 

GWGWGPGSTGNGAWASAPCPEPPLSSAAAR 

GKGVHQRLLGKGQHT 


92 


1442 


A 


1246 


5 


562 


VFDEENILNELNDPLREEIVNFNCRKLVATMP 

LFANADPNFVTAMLSKJLR1-EVFQPGDY1IREG 

AVGKKMYFIQHGVAGVITKSSKEMKLTDGS 

YFGEICLLTKGRRTASVRADTYCRLYSLSVD 

NFNEVLEEYPMMRRAFETV AI DRLDRIGKJCN 

SILLQKFQKDLNTGVFNNQENE1LKQIVKH 


93 


1443 


A 


1249 


180 


901 


TVPPPPGGPSPAPLHPKRSPTSTGEAELKEERL 
PGRKASCSTAGSGSRGLPP\SSPMVSSAHNPN 
KAEIPERRKDSTSTPNNLPPSMMTRKNTYVCT 
ERPGAERPSLLPNGKENSSGTPRVPPASPSSHS 
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LAPPS GERSRL ARGSTIRSTFHGGQVRDRRA G 
GGGGGG VQN GPPASPTLAHEAAPLPAGRPRP 
TTNL FTKXTSKLTRRV ADEPERIGGPE VTRRT 
RQEDHLSPGGRGCSEL 


94 


1444 


A 


1261 


3 


385 


KFSQWGLTKPKLSNASP/W1SLVKKLMKKWS 

VTONLTFREOLEAGIRYFDLRVSSKPGDADO 

EIYF1HGLFG1KVWDGLMEIDSFLTQHPQEIIFL 

DFNHFYAMDETHHKCLVLRJQEAFGNKLCPA 

CR 


95 


1445 


A 


1282 


2 


550 


GPRDNPG\EDPRFEIVEHFGIAWFTFELVARFA 

VAPDFLKFFKNALNUDLMS1VPFYITLVVNL 

VVESTPTLANLGRVAQVLRLMRIFRILKLARH 

STGLRSLG ATLKYS YKEVGL1XL YLS V G1S1FS 

WAYTIEKEEN\EGLATIPACWWWATVSM7T 

VGYGDWPGTTAGKLTASAC1LA 


96 


1446 


A 


1294 


1 


1456 


QLLPPSNRENAGLLVGRCLCSAALRPVGDLIT 

SSGQVAVRNAPQAGSAKAGKGKFQDNFEFIQ 

YFKKFFDANCNEKDYNPVAAGQGQETEVAP 

SIVAPVLNKPNQCPEGYICVKAGRNPNYGYT 

SFDTFSWAFLSLFRLMTQDYWENLYQLTLRA 

AETTYMIF/LV/LVILLGSLYLVTLILAVA^AMA 

EAAQQAATATASEHSREPSAAGRLSDSSSEAS 

KLSSKSAKERRNRRKKRKQKEQSGGEEKDED 

EFQKSESEDSIRRKGFRFSIEGNRLTYEKRYSS 

PHQSLLSIRGSLFSPRRNSRTSLFSFRGRAKDV 

GSENDFADDEHSTFEDNESRRDSLFVPRRHGE 

RRNSNLSQTSRSSRMLAVFPANGKMHSTVDC 

NGWSLVGGPSVPTSPVGQLLPEVIIDKPATD 

DNGTTTETEMRKRRSSSFHVSMDFLEDPSQR 

QRAMSIASILTNTVE 


97 


1447 


A 


1295 


2 


2057 


1QTQLPTKSSQQLRKGGNCVRCKMQMNF1AE 

EVLLKYRlTFyKNNKGPNMLYIEIKAFVHFMI 

NRYLSYGSGPKRFPLV DVLQY ALEF ASSKPV 

CTSPVDDIDASSPPSGSIPSQTLPSTTEQQGALS 

SELPSTSPSSVAA1SSRSV1HKPFTQSR1PPDLP 

MHPAPRH1TEEELSVLESCLHRWRTEIENDTR 

DLQES1SRIHRTIELMY SDKSMIQVP YRLHAV 

LVHEGQANAGHYWAYIFDHRESRWMKYND1 

AVTKS SWEELVRDSFGG YRKASAYCLMYIN 

DKAQFLIQENDLIKTG QPLVGIETLPPDLRDFV 

EEDNQRFEKELEEWDAQLAQKALQEKLLAS 

QKLRESETSVTTAQAAGDPKYLEQPSRSDFSK 
HI KEFTIOTTTKASHFHFDKSPFTVT O^ATKT F 

Y ARL VKLAQEDTPPETD YRLHH V VVYFI QN Q 

APKKIIEKTLLEQFGDRNLSFDERCHNIMKVA 

QAKLEMIiCPEEVNLEEYEEWHQDYRKFRETT 

MYLIIGLENFORESYIDSLLFLICAYONNKELI 

SKGLYRGHDEEUSHYRRECLLKLNEQAAELF 

ESGEDREVNNGLIIMNEFIVPFLPLLLVDEMEE 

KDILAVEDMRNRWCSYLGQEMEPHLQEKLT 

DFLPKJXDCSMEIKSFHEPPKLPSYSTHELCER 

FARIMLSLSRTPADGR 


98 


1448 


A 


1304 


118 


453 


SGPSSRAIYLHRKEYSQNLTSEPTLLQHRVEH 
LMTCKQGSQRVQGPEDALQKLFEMDAHGRV 
WSQDL1LQVRDGWLQLLDIETKEELDSYRLD 
SIQAMNVALNTCSYNSILS 


99 


1449 


A 


1306 


3 


1660 


CGYFCHTTCAPQAPPCPVPPDLLRTALGVHPE 
TGTGTAYEGFLSVPRPSGVRRGWQRVFAALS 
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DSRLLLFDAPDLRLSPPSGALLQVLDLRDPQF 

SATPVLASDVIHAQSRDLPRIFRVTTSQLAVPP 

TTCTVLLLAESEGERERWLQVLGELQRLLLD 

ARPRPRPVYTLKEAYDNGLPLLPHTLCAAILD 

QDRLALGTEEGLFVIHLRSNDIFQVGECRRVQ 

QLTLSPSAGLLVVLCGRGPSVRLFALAELENI 

EWEVPKIPESRGCQVLAAGSILQARTPVLCVA 

VKRQVLCYQLGPGPGPWQRRIRELQAPATVQ 

SLGLLGDRLCVGAAGGFALYPLLNEAAPLAL 

GAGLVPEELPPSRGGLGEALGAVELSLSEFLL 

LFTTAGIYVDGAGRKSRGHELLWPAAPMGW 

GYAAPYLTVFSENSIDVFDVRRAEWVQTVPL 

JCKA V RPLNPEG SLFL Y GTEKVRLT YLRN QL A E 

KDEFDIPDLTDNSRRQLFRTKSKRRFFFRVSE 

EQQKQQRKEMLKDPFVRSKLISPPTNFNHLV 

HVGPAN GRPG ARDKSP 


100 


1450 


A 


1318 


918 


190 


SLCVPGPVDTGTFAVMSVMVGSVTESLAPQA 

LNDSMINETARDvVARVQVASTLSVLVGLFQV 

GLGLIHFGFVVTYLSEPLVRGYTTAAAVQVF 

VSQLKYVFGLHLSSHSGPLSLIYTVLEVCWKL 

PQSKVGTVVTAAVAGVVLWVKLLNDKLQQ 

QLPMPIPGELLTLIGATGISYGMGLKHRFEAGN 

PP VAPNTQLFSKL VGS AFTIAV VGF AIAI SLGK 

IFALRHGYRVDSNQVWVMRDV 


101 


1451 


A 


1353 


220 


445 


DWPDLFTYPLIGSPKCFQSARPE\RMYRRTVR 
SSHGNHALQEVLPRSGHGTEFTKQKHLEAAD 
HGHPPARMSIFSR 


102 


1452 


A 


1363 


542 


2 


AHLLMLNLAL\TDLL\YLTSLPFL1HYYASGEN 

WEFGDFMCKFIRFSFHFNLYSSILFLTCFSIFRY 

CVIIHPMSCFSIHKTRCAWACAVVWIISLVA 

VIPMTFLITSTNRTNRSACLDLTSSDELNTJKW 

YNLILTAVLLCLPLVIVTLCYTTIIHTLTHGHAN 

\DSCLKQKARRLTILLL 


103 


1453 


A 


1371 


2 


410 


CHSTESSSDFILPGDYLLGGLCPLHSGCLQVXC 

SFNEHGYHLFQAMRLAVEEINNSTALLPNITL 

GYQLYDVCSDSANVYATLRVLSLPGQHHIEL 

QGDLLHYSPTVLAVIGPDSTNRAATTAALLSP 

FLVPMLLEQ 


104 


1454 


A 


1376 


3 


432 


NSRVEDRS/NMSLWTQNITVCPVKNVTRDGG 
FGPWSPWQPCEHLDGDNSGSCLCRARSCDSP 
RPRCGGLDCLGPAIHIANCSRNGAWTPWSSW 
ALCSTSCGIGFQVRQRSCSNPAPRHGGR1CVG 
KSREERFCNENTPCPVPIF 


105 


1455 


A 


1379 


2 


396 


GLGLLYLBFAAVEGVMRVIGGSNHLAVVLDD 

IILAVIDSIFVWFIFISLAQTMKTLRLRKNTVKJF 

SLYRHFKNTLIFAVLASIVFMGWTTKTFR1AK 

CQSDWMERWVDDAFWSFLFASLILIVIMFLW 

RPSA 


106 


1456 


A 


1383 


1 


432 


EDGHGGWSSRCLVDHAEEGHREPWKRLCIW 
QRGGHEIRFAFYFPGHPLLSPQICLAPETPPRG 
CPPVSSLHFISLQ/RLPRDCQELFQVGERQSGL 
FEIQPQGSPPFLVNCKMTSGTFWTCRTDSRVF 
QNANPSNAAHSEDQPTP 


107 


1457 


A 


1386 


719 


558 


FFFVTRSHSVAQAECSGVFTAHRSLDLVGSSN 
YPALSLQSSWDHRHTWLTFAFL 


108 


1458 


A 


1397 


631 


2 


RVA1SLLCAAIFISFMVQSAGKRWPTGVMLM 
VVVLFAFLYSWPIQALLPTYLKTDLAYNPHT 
VANVLSFSGFGAAVGCCV/GGFLGDWLGTRK 
AYVCSLLASQLLIIPVFAIGGANVWVLGLLLF 
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FQQMLGQGIAGILPKLIGGYFDTDQRAAGLG 
FTYNVGALGGALAPIIGALIAQRLDLGTALAS 
LSFSLTFVVILRNRRPGKSLVR 


109 


1459 


A 


1402 


15 


387 


VLVALPDTWTSETVVTEVLGHRVTLPCLYSS 
WSHNSNSM C WGKDQCPYSGCKEALIRTDGM 
RVTSRKSAKYRLQGTIPRGDVSLTILNPSESDS 
GVYCCRIEVPGWFNDVK1NYRLNLQRASTT 


110 


1460 


A 


1421 


3 


350 


HEDLSSLLTRGSGNQERERQLKKLISLRDWM 
LAELAFPVG VLATCA* SLLSC* YCVILFPCSCF 
FFHSPDALFSLLLLSC YFPSYCFFYYLFFS SSPL 
CLLLASSPFPLFILLASL 


111 


1461 


A 


1426 


2 


344 


FTSTMTKPFEKESEQPA*ATLAFGAQTSTTAD 
QCALKPDLSYLNNSSSSSSTPATSAGGGIFGSS 
TSSSNPPVATFVFGOSSDPVSSYGFVNTAF<?<5T 
SDSLLFSQDSKXATTS 


112 


1462 


A 


1434 


46 


372 


TTSWTTSCTRSCT* SGASSGPG W1PRTTWWR 
SRRSSQRTCSRACSGAWSRTW*RSS*TSSSSC 
STSCSSSSSRSCGRPGGPLGARGVHITSCLNSC 
MSSSTTSSTTSTF 


113 


1463 


A 


1439 


3 


292 


HEDIMTHYDRLVDE*ALNAGKQRYEKMISG 
MYLGEIVRNIUDFTKKGFLLRGQISEMLKTR 
GIFLTFLLSNFLIVCVLLFYVSFYLFQSCINFVL 


114 


1464 


A 


1463 


1 


396 


KOOAVPEPHSSTTTPOFOFONWYnOni iwin 

QRTKVHLPGHKTGPAVAKDTPEPVKKEFTVP 

ATSQGP*SPFSEEPPLPPSNEEVPPTLPP*EPQS 

EDP*KNA*LKQMHAATTHWQQHQQHQVGC 

QYHGIMO 


115 


1465 


A 


1464 


291 


2 


AGSYPSMVWSCHWGVTQKRRAL*VYSFEEG 
GRRKCGQYWPLEKDSRIRFGFLTVSNLGVEN 
MNHYKKSTLEILNPEVNPGFFFLTLWKQGEN 
NYCN 


116 


1466 


A 


1465 


667 


337 


LPPORPA*TDSYSTCNVSSfiFl AGOSHNIHT O 
YWTKYQVWEWLQHFLDTNQLDANCIPFQEF 
DINGEHLCSMSLQEFTRAAGTAGQLLYSNLQ 
HLKWNGDSLFLCLSLPC 


117 


1467 


A 


1479 


1 


381 


GTSGGPKRVLVTERFPWQNPLPVNRGQAQR 

VLGPSNSFORWLOAOKLVSSHKPGONOKHK 

QLQATSVPHPVCMPLNNTQKSKQPLPSAPEN 

NPEEELASDPNNEESL*RPWALEDFEIGRPLG 

KGK 


118 


1468 


A 


1485 


3 


385 


TYLWL*GNPPFYEKNDGGLFELILRAKDEFNS 

PYWDDMSDSAKHFIRPLTGRDP*KPFPCDOPI 

QHPWIEGHTCLDNNIHQAASEPINNNFAESKR 

NLAFLATGVVRHMRKLFMGANLEGPGPTVS 

H 


119 


1469 


A 


1486 


1 


398 


GTTSKHH* LARS LIRGPFDHDLKPNAATRDQL 

NIIVSYPPTKQLTYEEQDLGWKFRYYLTNJQE 

KALTKFLKWVNWDLPQEAKQALELLGKWK 

PMDVKDSLELLSSHYTNPTVRRYAVARLRQA 

DDEDLLMYL 


120 


1470 


A 


1497 


3 


999 


MGESPAV * GYF VLAGMNSAGLSFGGGAGKY 

LAEWMVHGYPSENVWELDLKRFGALQSSRT 

FLRHRVMEVMPLMYDLKVPHWDFQTGRQL 

RTSPLYDRLDAQGARWMEKHGFERPKYFVP 

PDKDLLALEQSKTFYKPDWFDIVESEVKCCK 

EAVCVIDMSSFTEFEITSTGDQALEVLQYLFS 

NDLDVPVGHIVHTGMLNEGGGYENDCSIARL 

NKJISFFMISFTDQQVHCWAWLKKHMPKDSN 

LLLEDVTWKYTALNLIGPRAVDVLSELSYAP 














MTPDHJf^SLFCKEMSVGYANGIRVMSMTHl' 
GEPGFMLY1P1EYRWGFTMLSTLVSNS 


121 


1471 


A 


1498 


3 


306 


AQFLLVGWDHIL+LIVL+TNLTELGRTTCDQN 

WPNSPDVLNHGCFYMQCLSKDCTIGYVSRE 

MLVAHTHTVEEHTGTHLQYVSWPDHSVPDD 

SSDFVEFEN 


122 


1472 


A 


1533 


121 


329 


LGLFSFVWTEVLEEPKDFSCETEDFKTLHCT 
WDPGTDTALGWSKQPSQSYTLFES*VGSGYII 

DNFFLA 


123 


1473 


A 


1547 


111 


408 


DARTTWKPRNGSSG1WPGDGAK*PPAVEQAE 
RGHVEMIEKLTFLNLHTSEKDKGGNTALHLA 
AKHGHSPAVQVLLAQWQDINEMNEKQQTPL 

HVAADRG 


124 


1474 


A 


1555 


1 


745 


MTFDDDDKNTYGVALVWKKFQTQSLRLSDL 

HRKSHLWRGIVSITLIEGRDLKAMDSNGLSDP 

YVKFRLGHQKYKSKIMPKTLNPQWREQFDF 

HLYEERGGVIDITAWDKDAGKRDDF1GRCQV 

DLSALSREQTHKLELQLEEGEGHLVLLVTLT 

ASATVSISDLSVNSLEDQKEREEILKRYSPLRI 

FHNLKDVGFLQVKVIRAEGLMAADVTGKSD 

PFCWELNNDRLLTHTVYKNLNPEWNKVFTL 

*VALVWKKFQTQSLRJLSDLHRKSHLWRGIVS 

ITLIEGRDLKAMDSNGLSDPYVKFRLGHQKY 

KSKIMPKTLNPQWREQFDFHLYEERGGV1DIT 

AWDKDAGKRDDRGRCQVDLSALSREQTHK 

LELQLEEGEGHLVLLVTLTASATVSISDLSVN 

SLEDQKEREEILKRYSPLR1FHNLKDVGFLQV 

KVIRAEGLMAADVTGKSDPFCWELNNDRLL 

THTVYKNLNPEWNKVFTL 


125 


1475 


A 


1556 


57 


509 


GGPAPNSRY AEP* KN SLAMT* AHADCENYV A 

CGGLDNICSIYNLKTREGNVRVSRELPGHTGY 

LSCCRFLDDSQIVTSSGDTTCALWDIETAQQT 

TTFTGHSGDVMSLSLSPDMRTFVSGACDASS 

KLWDIRDGMCRQSFTGHVSDINAVS 


126 


1476 


A 


1592 


3 


178 


KSEKSCVSSLAHFGTSCQRDYDAMVKLVETL 
EMLPTCDLADQHNIKFHYAFALNR*ER 


127


1477


A 


1 Ullb 


1 


497 


TESPLLVRPYLPYITKSELHAIMTAGFSTIAGS 
VLGAY1SFGVPSSHLLTASVMSAPASLAAAKL 
F WPETEKPKITLKN AMKMESGDSGNLL* AAT 
QG ASS S1SLVANIAVNLIAFLALLSFMNS ALA 
WVGNMFDYPQLSFELICSYIFMPFSFMMGVE 
WPDSFM 


128 


1478 


A 


1619 


286 


486 


CCMNSKAQESVFKNVLCNPPALSEMPDVKA 
EDEVDFRASSISEEVAVGSIAATLKMKQGPM 

TQAINR 


129 


1479 


A 


1627 


1 


395 


PTRG ALRY WIFGRFLCN1 W AA VD V RCCT ATI 

MGLCIISIDRYVGVSYPLRYPTIVTQRRGLMA 

LLCVWALSLVIY1GPLLGWRHPAPEDETICQI 

NEEPGYVLFSTPGSFYLPLAIMLVMN+RVYRV 

AKTE 


130 


1480 


A 


1638 


2 


466 


DPRVRTK1VNRKTT1YEIQDKTGSMAVVGKG 

ECHNIPCEKGDKLRLFCFRLRKRENMSKLMS 

EMHSFIQIQKNTNQRSHDSRSMALPQEQSQHP 

KPSEASTTLPESHLKTPQMPPTTPSSSSFTKVT 

KDKDIK+LLFNLYSSVEILPEVLHLKT 


131 


1481 


A 


1651 


607 


3 


LAEGGDVFDCVLNGGPLPESRAKALFRQMVE 
AIRYCHGCGVAHRDLKCENALLQGFNLKLTD 
FGFAKVLPKSHRELSQTFCGSTAYAAPEVLQ 
G1PHDSKKGDVWSMGWLYVMLCASLPFDD 
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TDIPKMLWQQQKGVSFPTOLSISADCQDLLK 
RLLEPDMILRPSIEEVS WHPWLAST* *KQWQV 
LSNKVGGESKPKKKK 


132 


1482 


A 


1656 


150 


48 


LVAKSLL YCGCLFFLLQLAKNV GNNSFND1M 
EANLTSPSPKPTPSSDM*VFLIY*TYFGAWHV 
VDAQ 


133 


1483 


A 


1660 


3 


406 


RKHIKLLIQKLSDVP* ECQNNQL* KLTEICEKE 
KKEFKKKMDDQRPEKJTEA* SKDKSPMEEEK 
TEMIRSYIQEVGRYIKRLEEAQSKRLEKLREK 
HKEIRQPILDEKPKGEGSSSFLSETCHEDTSWF 
PNFTP 


134 


1484 


A 


1666 


1276 


466 


PGSTHASARITIY*L*IILSNATEVDNNFSKPPP 

FFPAGAPPASSSSSSSSSSPPTVSTAPPLIPPPGF 

PPPPGAPPPSLIPTIESGHSSGYDSRSARAFPYG 

NVAFPHLPGSAPSWPSLVDTSKQWDYYARSS 

SSSSSSSSSSSSSPRDRDRER*RTRERERERDHS 

PTPSVFNSDEERYRYREYAERGYERHRASRE 

KEERHRERRHREKEETRHKSSRSN SRRRHESE 

EGDSHRRHKHKKSKRSKEGKEAGSEPAPEQE 

STEATPAE 


135 


1485 


A 


1673 


1 


417 


PTRPVNSSQAFALVYYTLGALGGNLIAHMGL 
GYRYWAGIGVI OSCESALTHYRI VANHVA^ 
DISLTGGSVVQRIRLPDEVENPGMNSGMLQE 
DLIQYYQFLAEKGDVQAQVGLGQLHLHGGR 
GV*QNHQRAFDYFNLAA 


136 


1486 


A 


1678 


525 


9 


ANTSLSSAAVSAVSPPPCRTSTATTLPPPMPSF 

FCVFPSPSMSPSPSEFLSC1ASVSRVHSLSSSSS 

GSSSTASSLNFSAIMGSSSATASWVLSTASTPP 

CPSALPSSPAQES*SLAASSSAWPVAGISPSGA 

CTFPAGSASGAAKAPSPSWRCPSFRALFSLLD 

SSSLSL 


137 


1487 


A 


1680 


1 


2999 


AHRDEIQRKFDALRNSCTVITDLEEQLNQLTE 

DNAELNNQNFYLSKQLDEASGANDEIVQLRS 

EVDHLRRJEITEREMQLTSQKQTMEALKTTCT 

MLEEQVMDLEALNDELLEKERQWEAWRSVL 

GDEKSQFECRVRELQRMLDTEKQSRARADQ 

RJTESRQVVELAVKEHKAEILALQQALKEQK 

LKAESLSDKLNDLEKKHAMLEMNARSLQQK 

LETERELKQRLLEEQAKLQQQMDLQKNHIFR 

LTQGLQEALDRADLLKTERSDLEYQLENIQV 

LYSHEK\TCMEGTISQQTKUDFLQAKMDQPA 

KKKKVPLQYNELKLALEKEKARCAELEEALQ 

KTRJELRSAREEAAHRKATDHPHPSTPATARQ 

Q1AMSAIVRSPEHQPSAMSLLAPPSSRRKESST 

PEEFSRRLKERMHHNIPHRFNVGLNMRATKC 

AVCLDTVHFGRQASKCLECQVMCHPKCSTC 

LPATCGLPAEYVTHFTEAFCRDKMNSPGLQT 

KEPSSSLHLEGWMKVPRNMKRGQQGWDRK 

YiVLEGSKVLIYDNEAREAGQRPVEEFELCLP 

DGDVS1HGAVGASELANTAKADVPY1LKMES 

HPHTTCWPGRTLYLLAPSFPDKQRWVTALES 

WAGGRVSREKAEADAKLLGNSLLKLEGDD 

RLDMNCTLPFSDQWLVGTEEGLYALNVLK 

NSLTHVPGIGAVFQIYIIKDLEKLLMIAGEERA 

LCLVDVKKVKQSLAQSHLPAQPD1SPN1FEAV 

KGCHLFGAGKIENGLCICAAMPSKWILRYN 

ENLSKYCIRKE1ETSEPCSCIHFTNYSILIGTNK 

F YEIDMKQ YTLEEFLDKNDH SLAP A VFAA S S 

NSFPVS1VQVNSAGQREEYLLCFHEFGVFVDS 
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140 
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1491 



1492 



143 
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526 



1693 



1704 



1743 



1769 



1789 



376 



376 



362 



406 



YGRRSRTDDLKWSRLPLAFAYREPYLFVTHF 
NSLEVIE1QARSSAGTPARAYLDIPNPRYLGPA 
ISSGAIYLASSYQDKLRVICCKGNLVKESGTE 
HHRGPSTSRR*PASPLPQYQGQRAFLQGRRK 



GRPQGPAPGAGSPPESGPGLWAALGCSLVWV 
PLCCLGGAAGRL*ARSGKSGLRRRRAHAGPP 
PGGPCNSCP*CSAPESGGRGPLPGPGTGGVCS 
CWTRG CQTT ARTAAAAAAPGPAGRRPPGG A 
PQNG SCAA SASQEAAAPPPMCPPGRRW A VAS 
PPETRCPAAPGTRCRRLEAA 



LPSMSNCTSCFRLQSRTES*IRQAGHLLGRNE 
FIETKALGCAWFSLCYYLVLYFESSHKVDFVF 
IV*CFSTPPGAQNfTIMSQACAERCNlMRLVDR 
RW AGI AKG VGTQK11 GRVHLGEQKALGL 



ERTNKFIKEL1MDGKNLIAATKSLSVAQRKFA 
HSLRDFKFEF1GDAVTDDERC1DASLREFSNFL 
KNLEEQREIMVS*EGCKLISQLSRGKK1WIWK 
LVLVEVVKHLSLGTVVHCNGKMRFPEP 



L1TNKVFVARELSCLDVHLDSTGSTAWADQ 
DKLELELVLKGSYEDTQTSFLGTASAFRFHY 
MAAL*TELSGRLRSSKSNGWNGDNSTGYLTV 
PLRPLT1VKEVTMDVPAPNVRGLNWMG 



447 



1814 



1827 



26 



404 



NNPSTLPRGS*PMSPRTTMGRRRQRRREHKSS 

LSLASSTVGPGGQIVHTETTEVVLCGDPLSGF 

GLQLQGGIFATETLSSPPLVCFIEPDSPAERCG 

LLQVGDRVLSINGIATEDGTMEEANQLLRJDA 

ALAHKVV 



QMLRNGGDQNTVPD YHFADR1RELL* PTEDQ 

KNCIP*DTYLRPSALGNIVEEVTHPCSPGPCPA 

NELCEVNRKGCTSGDPCLPYFCVQGCKLGQA 

SDFIARQGTLIQVPSSAGEVECYKICSCGQSGL 

LENCMEMHCMDLPTDTSALVR 



448 



PGRRFRPRLSQAGTDSGS*VFPDSFPSAPAEPL 
PYFLQEPQDAYIVKNKPVELRCRAFPATQIYF 
KCNGEWVSQNDHVTQEGLDEATGLRVREVH 
1EVSRQQVEELFGLEDYWCQCVAWSSAGTTK 
SRRAYVRI 



XVEEKHADTWRSXCLSDFFFHAAKXLCXE*N 
CGDAISLSVGDHFGKGNGLTWAEKFQCEGSE 
THLALCPIVQHPEDTCIHSREVGWCSRYTDV 
RLVNGKSQCDGQVEINVLGHWGSLCDTHWD 
PEDARVLCRQLNCGTAL 



146 



1496 



1828 



574 



333 



147 



1497 



1855 



148 



1498 



149 



1499 



1879 



1880 



568 



611 



QHEGGDLRRRQLGEIQLTVRYVCLRAASAC* 
SMAAET* HHVPASG ADPYVR VYLLPERKW A 
CRKKTSVKRKTLEPLFDET 



372 



ERLVLTSEHCLVLTLFWPSWTYHTLLLSRQH 
VRRLPKLTHAEHDHLASIMNKLLTNYDNLFE 
TSVTYSMG*HGAPTGSEAGANWNH**LHAH 
YYPPLLRSDTVRKFMVGSQMLAQAQRDLTPE 



24 



LLSALDDKGGTQPSASFSNAPTIVCVTACPAG 

1AHTYMAAEYLEKAGRKLGVNVYVEKQGAN 

GIEGRLTADQLNSATACIFAAEVAIKESERFN 

GIPALSVPVAEPIRHAEALMQQALTLKRSDET 

RTVQQDTQPVKSVKTELKQALLSGISFAVPLI 

VAGGTQVA*AV*RQGISSLHDVQVRTWNS 

GLN SEN ALSNEAMERGWQCLRLFAERLQD IP 

PSQIRVVATATLRLAVNAGDFIAKAQEILGCP 

VQVI SGEEEARLIYQG VAHTTGG ADQRL V VP
IGG ASTEL VTGTG AQTT* LFSLSMGC VT WLER
YFADRNLGQENFDAAQKAAREVLRPVADEL 
RYHSWKSVRGASVTVQALQEIMMAQGMDE 
RITMEIWPVD 


150 


1500 


A 


1894 


2 


750 


GRVDFFHTDYRPLIRDSNNYVLDEQTQQAPH 

LMPPPFLVDVDGNPHPTKYQRLVPGRENSAD 

EHLIPQLGYVATSDGEVIEQ1ISLQTNDNDERS 

PESS1LDGMIRQLQQQQDQRMGADQDTIPRG 

LSNGEETPRRGFRRLSLDIQSPPNIGLRRSGQV 

EGVROMHONAPRSOIATERDLOAWKRRWV 

PEVPLGIFRXLEDFRLEKGEEERNLYIIGRKRK 

TLQLSHKSDSVGLVSQSRPRTCRRKYP 


151 


1501 


A 


1900 


141 


785 


GKTIQIQTTMQNKYKTVQKQYKTIPKNKKA 

MEMQIKKQFQDTCKVQTKQYKALKNHQLEV 

TPKNEHKTILKTLKDEOTRKLAILAEOYEOS1 

NEMMASQALRLDEAQEAECQALRLQLQQEM 

ELLNAYQSKIKMQTEAQHERELQKLEQRVSL 

RRAHLEOKIEEELAALOKERSERIKNLLEROE 

REIETFDMESLRMGFGNLVTLDFPKEDYR 


152 


1502 


A 


1915 


2 


377 


LVRLLDTQRDGLQNYEALU3LTNLSGRSDKL 
RQKIFKERALPDIENYMFENHDQLRQAATEC 
MCNMVLHKEVOERFLADGNDRLKLVVLl CG 
EDDDKVQNAAAGALAMLTAAHKKJLCLKMT 
QVTT 


153 


1503 


A 


1921 


1 


237 


AYQSLRLEYLQIPPVSRAYTTACVLTSAAVQL 
ELriTFOLYFIPELIFKHFOIWRLITNFLFFVPFG 
FNFLLYMIFLYT 


154 


1504 


A 


1928 


2 


354 


EMVEGGEGKMCINTEWGGFGDNGCIDDIRTR 
YDTEVDEGSLNPGKORYEKMTSGMYLGEIV 
RQILIDLTKQGLLFRGQISERLRTOGIFETKFLS 
QIESDRLALLQVRRILQQLGLD 


155 


1505 


A 


1929 


2 


369 


TEIAKIKMEAKKKYEKELTMFONDFEKACOA 
KSEALVLREKSTLERIHKHQEIETKE1YAQRQ 
LLLKDMDLLRGREAELKQRVEAFESYQLELK 
DDYIIRTYRLIEDDRINIQISGHWQESP 


156 


1506 


A 


1935 


1 


270 


VTRKLPIFIVDAFTARAFRGSPAADCLLENEL 
DEDMHOKIAREMNL SETAFIRKLHPTDNF AO 
RSCFGLIWFTPTTDLQILTSSILPSIL 


157 


1507 


A 


1936 


584 


305 


ESKVNNEKFRTKSPKPAESPQS ATKQLD QPTA 
AYEYYDAGNHWCKDCOTICGTMFDFFTHMH 
NKKHTQGQFQKSSDFQKEELQQTFLPPERQG 


158 


1508 


A 


1939 


1 


423 


TTHRLNVTAEPPCTSMPIYWMPDVPHRCTTA 

NTCPVDLTDYCAQNGFYCLVYGFLPYGSLED 

RLHCQTQACPPLSWPQRLDILLGTARA1QFLH 

QDSPSLIHGDIKSSNVLLDERLTPKLGDFGLA 

RFSRFAGSSPIQSSM 


159 


1509 


A 


1974 


3 


401 


HTSTARLLLHRGAGKEAVTSDGYTALHLAAR 

NGHLATVKLLVEEKADVLARGPLNQTALHL 

AAAHGHSEVVEELVSADVIDLFDEQGLSALH 

LAAQGRHAQTVETLLRHGAHINLQSLKFQGG 

HGPAATLLR 


160 


1510 


A 


1982 


2 


417 


KFLKDLEKQYNKEEPHLSE1GSCFLQNQEGFA 
IYSEYCNNHPGACLELANLMKQGKYRHFFEA 
CRLLQQMIDIAIDGFLLTPVQK1CKYPLQLAEL 
LKYTTQEHGDYSNIKAAYEAMKNVACLINER 
KRKLESIDK1A 


161 


1511 


A 


1984 


4 


770 


RETGSVSLSPSGLEGAESYAVSPILYSSPDVKE 
LWLETLQGQRHSHTGVKSTPGQSAAILMKLR 
SSHNASKTLNANNMETLIECQSEGDIKEHPLL
ASCESEDSICQUEVKKRKKVLSWPFLMRRLS
PASDFSGALETDLKASLFDQPLSIICGDSDTLP 
RPIQDILTILCLKGPSTEGIFRRAANEKARKEL 
KEELNSGDAVDLERLPVHLLAVVFKDFLRSIP 
RKLLS SDLFEEWMGALEMQDEEDRJEALK 


162 


1512 


A 


1986 


864 


501 


LLNSGLFSAPDGSNLEMRLTRGGNMCSGRIEI 
KFQGRWGTVCDDNFNIDHASVICRQLECGSA 
VSFSGSSNFGEGSGPIWFDDLICNGNESALWN 
CKHQGWGKHNCDHAEDAGVICSSKD 


163 


1513 


A 


2001 


419 


187 


A VDL SIDESSLTGETTPCSKVTAPQPAATNGD 
LASRSN1AFMGTLVRCGKAKGVVIGTGENSE 
FGDIINLSTFWHS 


164 


1514 


A 


2012 


284 


597 


SLLCLFPGTSTVVCKPIVIETQLYVIVAQLFGG 
SHIYKRDSFANKFlKIQAIEILKIRKPNDlETFKi 
ENNW YF W AD S SKAG FTTIY KWERETGF YSH 
QSFTR 


165 


1515 


A 


2013 


2 


403 


EDPEELGHFYDYPMALFSTFELFLT7IDG PAN Y 
NVDLPFMYSITYAAFAIIATLLMLNLLIAMMG 
DTH WRV AHERDELWRAQIVATTVML ERKLP 
RCLWPRSGICGREYGLGDRWILRVEDRQDLN 
RQRIQRYA 


166 


1516 


A 


2019 


2 


927 


CCQREGLGLKAWQILLSHGRNGLPGEPASS 

QGLSAASSTPVFHLALQIDSAPDNIDWVEMLF 

NKNMVTERLQNVMVLEQCFSDSSSLYRFLTY 

SYLLAFNVWLLLAPVTLCYDWQVGSIPLVETI 

WDMRNLAT1FLAVVMALLSLHCLAAFKRLE 

HKEVLVGLLFLVFPFIPASNLFFRVGFVVAER 

VLYMPSMGYCILFVHGLSKLCTWLNRCGATT 

LIVSTVLLLLLFSWKTVKQNEIWLSRESLFRS 

G VQTLPI INAK VHYN Y ANFLKDQGRNKEAI Y 

H Y RT ALNNNKA WD YLC WRFRKTLTDLP 


167 


1517 


A 


2025 


696 


71 


AAASAASSLTVTLGRLASACSHS1LRPSGPGA 

ASLWSASRRFNSQSTSYLPGYVPKTSLSSPPW 

PEVVLPDPVEETRHHAEWKKVNEMIVTGQY 

GRLFAWHFASRQWKVTSEDLILIGNELDLA 

CGERIRLEKVLLVGADNFTLLGKPLLGKDLV 

RVEATVIEKTESWPRJIMRFRKRKNFKKKRIV 

TTPQTVLRTNSIEIAPCLL 


168 


1518 


A 


2046 


2 


366 


HLQVAARVFMPLQAVDSAPKPLKGQAQAPQ 
RLQGAARVFMPLQAQVKAKASKPLQMQIKA 
PPRLRRAARVLMPLQ AQVRAPRLLQ VQ SQVS 
KKQQAQTQTSEPQDLDQVPEEFQGQDQVLR 


169 


1519 


A 


2049 


1 


945 


QNLEDREVLNGVQTELLTSPRTKDTLSDMTR 

TVEISGEGGPLGIHVVPFFSSLSGRILGLFIRGI 

EDNSRSKilEGLFH ENECI VKfNNVDL VDKTFA 

QAQDVFRQAMKSPSVLLHVLPPQNREQYEKS 

VIGSLN1FGNNDGVLKTKVPPPVHGKSGLKTA 

NLTGTDSPETDASASLQQNKSPRVPRLGGKPS 

SPSLSPLMGFGSNKNAKJCIKIDLKKGPEGLGF 

TVVTRDSSIHGPGPIFVKNILPKGAAIKDGRLQ 

SGDR1LEVNGRDVTGRTQEELVAMLRSTKQG 

ETASLVIARQEGHFLPRELVMFRSQSH 


170 


1520 


A 


2050 


363 


1 


PVATHLTKILNSDEHAVVISSAKTLCETVKDF 
VAKVEKTYDKTLENAWADAVASKCSVLNE 
KLEQLLQALHTDSQAAPVLPGLSPLIVEEDAV 
ESSSEESLGESKEQLGDDVTKPSSQKA 


171 


1521 


A 


2055 


139 


675 


IPSRPWLGRITGLDPAGPLFNGKPHQDRLDPS 
DAQFVDVIHSDTDALGYKEPLGNIDFYPNGG 
LDQPGCPKTILGGFQYFKCDHQRSVYLYLSSL
RESCTITAYPCDSYQDYRNGKCVSCGTSQKE
SCPLLGYYADNWKDHLRGKDPPMTKAFFDT 
AEESPFCMYIIYFVD1ITWNKNVR 


172 


1522 


A 


2056 


3 


361 


UQHKSAVEYAQSHLSLVSMCKESHKCSEPK 
MEWKVKIRSDGTRYITKRPVRDR1LKERALK1 
FCEERSGLTTDDDTMSEMKMGRY WSKEERKQ
HLVRGKEQRRRREFMMRIRLKCLKES 


173 


1523 


A 


2060 


1 


387 


GTRILSMQIPFVGFQPIRTSEHMAAAGVFALL 
QAYAFLQYLRDRLTKQEFQTLFFLGVSLAAG 
AVFLSVIYLTYTGYIAPWSGRFYSLWDTGYA 
KIHIPIIASVSEHQPTTWVSFFFDLHILGCTFPA 
G 


174 


1524 


A 


2071 


74 


443 


LLMGPKAKKSGSKKKKVTKAERLKLLQEEEE 
RRLKEEEEARLKYEKEEMERLEIQRIEKEKW 
HRLEAKDLERRNEELEELYLLERCFPEAEKLK 
QETKLLSQWKHYIQCDGSPDPSVAQENfNT 


175 


1525 


A 


2083 


139 


486 


AALTWSQPQEFWPMEMQPIVTDMVTVHWV 
AESSTVGWLCALFRVTHVGVGATGHGVVCG 
RRVLCGLPLPSPAPMPIMSLPEGESRKEREVQ 
RLQFPYLEPGHELPATTLLAFLAAV 


176 


1526 


A 


2092 


3 


587 


EGSVNFKFGVLFAKDGQLTDDEMFSNEIGSEP 

FQKFLNLLGDTITLKGWTGYRGGLDTKNDTT 

GIHSVYTVYQGHEIMFHVSTMLPYSKENKQQ 

VERKRHIGNDIVTIVFQEGEESSPAFKPSMIRS 

HFTHIFALVRYNQQNDNYRLKIFSEESVPLFG 

PPLPTPPVFTDHQEFRDFLLVKL1NGEKATLET 

PCI 


177


1527 


A 


2103 


44 


427 


GKGQVSLEGRPHRGPLCLGSWWPGSRVPGC 
CDG A WL AW AC WVFGNDFPS PAS AAC S ALL G 
CSVSTACLCVPLCSGSPLAPFRRTAALQEGLR 
RAVSVPLTLAETVASLWPALQELARCGNLAC 
RSDLQ 


178 


1528 


A 


2104 


2 


409 


ALQSTLGAVWLGLLLNSLWKVAESKDQVFQ 
PSTAASSEGAWEIFCNHSVSNAYNFFWYLHF 
PGCAPRLLVKGSKPSQQGRYNMT YERFSS SL 
LILQVREADAAVYYCAVEVPNTDKLIFGTGT 
RLQVFPNIQNPD 


179 


1529 


A 


2111 


1 


312 


PTRSSTRPPSLFVHASAKGGEKEEGDDGHYL 
MRTESHTGLKKGGNANLVFMLKRNTEPKKG 
SYHFDLERLRAAHILFEREQEHLAPGGISMPL 
PPPLPLPACLG 


180 


1530 


A 


2116 


3 


366 


TSIKRAIETTDVTR SFG WDSSEA WQQHD VQE 
LCRVMFDALEQKWKQTEQADLINELYQGKL 
KDYVRSLECGYEGWRIDTYLDIPLVIRPYGSS 
QAFASWCTFHLTACVSLHRIHNSTW 


181 


1531 


A 


2117 


2 


386


YGLGAHFGRLFIQAGINENDFYDGAWCAGR 

NDLQQWIEVDARRLTRFTGVITQGRNSLWLS 

DWVTSYKVMVSNDSHTWVTGKNGSGDMIFE 

GNSEKEIPVLNELPVPMVARYIRINPQSWFDN 

GSICI 


182 


1532 


A 


2123 


1 


493 


RTKTDVYILNLAVADLLLLFTLPFWAVNAVH 

GWVLGKIMCKITSALYTLNFVSGMQFLACISI 

DRYVAVTKVPSQSGVGKPCWIICFCVWMAAI 

LLSIPQLVFYTVNDNARCIP1FPRYLGTSMKAL 

IQMLEICIGFVVPFLIMGVCYFITARTLMKMP 

NIKIS 


183 


1533 


A 


2140 


3 


561 


RQAWHEAFKVRKE1LTV1CCLLAFCIGL1FVQ 
RSGNYFVTMFDDYSA1LPLLIVVILENIAVCF 
VYGIDKFMEDLKDMLGFAFSRYYYYMWKYI
SPLMLLSLLIASWNMGLSPPGYNAW1EDKAS

EEFLSYPTWGLAVCASLDVFA1LPVPVAFIGR 

RFSLIDDGAGPFCSAAYTTTGCRTPYL 


184 


1534 


A 


2145 


3 


538 


HELTVAAADRGQPPQSSWPVTVTVLDVND 

NPPVFTRASYRVTVPEDTPVGAELLHVEASD 

ADPGPHGLVRFTVSSGDPSGLFELDESSGTLR 

LAHALDCETQARHQLWQAADPAGAHFALA 

PVTIEVQDVNDHGPAFPLNLLSTSVAENQPPG 

TL VTTLHAIDG DAG AFGRLRYHL 


185 


1535 


A 


2151 


2 


671 


LDKLLDRMENYNIFNEYILKQVAATYIKLGW 

PKNNFNGSLVQASYQHEELRREVIMLACSFG 

NKHCHQQASTLISDWISSNRNRJPLNVRDIVY 

CTGVSLLDEDVWEFIWMKFHSTTAVSEKK1L 

LEALTC SDDRNLLNRLLNLSLNSE WLDQD AI 

DVHHVARNPHGRDLAWKJFFRDKWKILNTRJ 

RQKTLEFDFAEPLILAFPIILYTA1DNPPLVREH 

E 


186 


1536 


A 


2153 


2 


400 


GPMCDKHS AF AEKFHAGF1D YI VHPL WET WA 
HLALPDAQDILYTLEDNRNWVDSMIPQSPSPP 
LDEQNRDWQGLLENLHVELTLDEEDSEGPEK 
EGEGQTYFTSSKTLCGIVPQNTDSLGETGIHIC 
AHDKSP 


187 


1537 


A 


2158 


227 


442 


FNCFRV ASDSFLENS SLLIMILPLRNATQEFI IR 
PGAVAYTCNPSTLGGWGGW1TRSGVRDQPG 

QHGGTPS 


188 


1538 


A 


2167 


3 


486 


AHLGGAWLTQRSLGSWAAPGPARAAKEWA 
CIPQNQKMNI WRMKXSKHLQLLSF VLGAVSP 
A WVP YMM VLQENG YG VEEG IPTL LMAA S S 
MDDILAITGFNTCLSIVFSSGCARSSGSRNSKS 
LRTPLGTICEGCDDSSIFSHLDHSSKWSSTYG 
HSGA 


189 


1539 


A 


2168 


2 


412 


EFLSSNQITQLPNTTFRPMFNLRSVDLSYNKL 

QALAPDLFHGLRKXTTLHMRANAIQFVPVRIF 

QDCRSLKFLDIGYNQLKSLAKNSFAGLFKLTE 

LHLEHNDLVKVNFAHFPRLISLHSLCLRRNKV 

AIWSSLDW 


190 


1540 


A 


2179 


64 


399 


MRLNQNTLLLESFGXXRPYTSEHAPTYHQW 
MKADELLRWTTSEPLTLEHEYAMQRTWLED 
AYECTFIVLDAEKRHAQPGATEESCMVGDVN 
LFLTDLEDLTLGEIEVLIAEP 


191 


1541 


A 


2190 


1 


469 


CLDRAAGIRHERNVIYINETHTRHRGWLARR 
LSYVLFIQERDVHKGMFATNVTEN VLN S SRV 
QEAIAEVAAELNPDGSAQQQSKAVNKVKKK 
AKRILQEMVATVSPAMIRLTGWVLLKLFNSF 
FWNIQIHKGQLEMVKAATETNLPLLFLPVHR 
SH 


192 


1542 


A 


2197 


26 


157 


PSKXGGIRLLLTGTQLYGRFGSA1APLGDLDR 
DGYNGEGREEPY 


193 


1543 


A 


2236 


2 


383 


EYFPNSIWRSLFSTMDLGDIGFYTYRILQALS 
YTHSKGIMHRDVKPLN1LCNSPRNKVILADW 
G L AEFYHPMRKYS VHVATRYYKSPEILLDYE 
YYDYSLDIWAVGVILLELLTLKLHVFEGGDN 

EQ 


194 


1544 


A 


2241 


105 


409 


RKGVGKMPTSEGRPGQERSDWVTSYKVMGS 
NDSHTWVTVKNGSGDMIFEGNSEKEIPVLNE 
LPWMGARYIRINPQSWFDNGSICMRMEILGC 
PLPDPNNY 


195 


1545 


A 


2245 


1 


672 


MGVASDWTKRIEYQPGSGSMPLFPSIHLETCD 
GAVSSLQIVTELQTNY1GKGCDRETYSEKSLQ
KLCGASSGIIDLLPSPSAATNWTAGLLVDSSE
MIFKFDGRQGAKIPDGIVPKNLTDQFTITMW 

MKJrnjrbrU VKAfcKJb J ILL- Y bJJiv 1 blWINKriti Y 

ALYVHNCRLVFLLRKDFDQADTFRPAEFHW 

KLDQQALAKVDGQPGKSITRQLQEMPVTIQG 

ISLKPS 


196 


1546 


A 


2256 


1 


396 


FRGTPVSGLTNRDTLAVJRHFREP1RLKTVKP 
GKVINKDLRHYLSLQFQKGSIDHKLQQVIRD 
NL YLRTIPCTTRAPRDGE VPG VD YN r IS V EQr 
KALEESGAELESGTYDGNFYGTPKPPAEPSPF 
QPDPV 


197 


1547 


A 


2259 


43 


594 


QLAIEIGVRALLFGVFVFTEFLDPFQRVIQPEEI 
WLYKNPLGQSDNIPTRLMFAISFLTPLAVICV 

T, TTS TlTi T\ T^Pil/TF TT/ T~* A T^T A T F^tT AT AT VT/^T J f~> r l "»K T jF *T , T 

VKnRRTDKTEIKEAFLAVSLALALNGVCTNTI 
KLIVGRPRPDFFYRCFPDGVMNSEMHCTGDP 
DLVSEGRKSFPS1HSSF AFSGLG F 1TF YL AGKL 
HCFTESGRGKSWRLCAAILPL 


198 


1548 


A 


2275 


3 


404 


TCTTV WIPRMLVDFLSESKTI SLPECATQMFF 

FLGFASNNCnMAAMSYDRYTAIHNPLQYHT 

LMTRKICLQMMMASWMVGFLFSLCIIVTVFN 

LSLCDLNTIQHYFCDISPVVSLACNYTFYHEM 

AIFVLSA 


199 


1549 


A 


2315 


1 


375 


LTQMFFIHALS AIESTILLAMAFDRYVAI CHPL 
RHAAVLNNTVTAQ1GIVAWRGSLFFFPLPLL1 
KRLAFCHSNVLSHSYCVHQDVMKLAYADTL 
PNWYGLTAILLVMGXDRMFISLSYFLII 


200 


1550 


A 


2334 


2 


409 


PRVRPQQRKMSFFFKTELGEKLVTKFLFETDF 
SDDPMLPSPDQLKXKAPFTNKKLKAHQTPVD 

TT T f /"XT 7 ATT At A I"^ - * /AT FA A T 1^*1 v"** V T A ^ TTTY\ T\ A 'VTXIT^ 

ILKQKAHQLASMQVQAYNGGNANPRPANNE
EEEDEEDE YDYD YE SLSDDNILEDRPENKSCH
DQLQFEYKEEM


201 


1551 


A 


2350 


3 


512 


ISWEAQIAEIIQWVSDEKDARGYLQALASKM 
TEELEALRSSSLGSRTLDPLWKVRRSQKLDM 
SARLELQSALEAEIRAKQLVQEELRKVKDAN 
LTLESKLKDSEAKNRELLEEMEILKKKMEEK 
FRADTGKLMLCDSALFEYKYFSNECFYFLFD 
LIVTLEAPTEFQIQY 


202 


1552 


A 


2351 


1 


1003 


PSSYSSDELSPGEPLTSPPWAPLGAPERPEHLL 

NRVLERLAGGATRDSAASDILLDDIVLTHSLF 

LPTEKFLQELHQYFVRAGGMEGPEGLGRKQA

CLAMLLHFLDTYQGLLQEEEGAGHHKDLYL 

LI MKD E S L YQGLREDTLRLH QL VET VELK1PE 

ENQPPSFCQVKPLFRHFRR1DSCLQTRVAFRGS 

DHFCRVYMPDHSYVTIRSRLSASVQDILGSV 

TEKLQ Y S EEPAG RED SLIL V A V b b bGEK V LL Q 

PTEDCVFTALGINSHLFACTRDSYEALVPLPE 

EIQVSPGDTE1HRVEPEDVANHLTAFHWELFR 

CVHELEr VD Y VrHuE 


203 


1553 


A 


2361 


2 


403 


NNLNC AEPLFEQNNSLNVNFNTQKKTVW LIH 
G YRP VG S IPL W L QN F VRILLN EEDMN VI W D 

WQPfiATTFIVkTR A VIOdTRKV A V<II QVHTT^TsJT 

W oix\J/\ 1 1 ll I INJWA V rVlN 1 JVIV VA V OLtO V Ail rvl> L. 

LKHGASLDNFHF1GGSLGAH1SGFVGKIFHGQ 
LGRITGLDP 


204 


1554 


A 


2390 


280 


476 


SPSLLPQCLMSLSDLSLSPAPPSHLSPRCPSPQ 

AGSRLGAMRRCAREMDATPMPPAPSCPSERV 

T 


205 


1555 


A 


2400 


543 


745 


AAVALRD1SWQQPYPMDFYAGSSLGPWTVN 

HGQDRRPHAPGRPARGKVQEGSARPPSAVAC 

EDCSCR
206


1556 


A 


2406 


122 


485 


DLSPDSREDHPQGHRRLLPKRPVRGSLMPGH 
THHPCPVSSTTNDTPDQIWV SVG SLRMGTGG 
MGANASTSPRCWDLSSGNKKWIIQVPILASIV 
ESRGGLLATGVGGMCACVPRNQPLTGT 


207 


1557 


A 


2409 


289 


418 


LWTLYRHKQQVQHNHSNRLSCRPSQEDRAT 
HTIMVLDKENTLS 


208 


1558 


A 


2413 


64 


492 


VQGTGXXF1AFTEAMTHFPASPV W AGM FFI , 

MLJNLGLGSMlGTMAGrrrPIIDTFKVPKEMFT 

GGCCVFAFLVGLLfVQRSGNYFVTMFDDYSA 

TLPLTLIVILENIAVAWIYGTKKFMQELTEML 

GFRPYRFYFYMWKFVSP 


209 


1559 


A 


2417 


3 


877 


EKERLLDEWFTLDEVPKGiCLHLRLEWLTLMP 

NASNLDKVLTDIKADKDQANDGLSSALL1LY 

LDSARNLPIRYKTNEPVWEENFTFFIHNPKRQ 

DLEVEVRDEQHQCPLGNLKVPLSQLLTSEDM 

TVSQRFQLGNSGPNSTIKMKIALRVLHLEKRE 

RPPDHQH SAQVKRPSVSKEGRKTSIKSHMSG 

SPGPGGSNTAPSTPVIGGSDKPGMEEKAQPPE 

AGPQGLHDLGRSSSSLLASPGHIS VKEPTPSI A 

SDISLPIATQELRQRLRQLENGTTLGQSPLGQI 

QLTIP 


210 


1560 


A 


2422 


35 


456 


REFAASDLEPFTPTDQPISPEAITQPSCIKRQRA 

AGNPGSLAATIDHKPCSAPLEPKIQASRNQRW 

GAVRAAESLTDIAEPASPQVHETPIDASQTQK 

VEPASKSRFTPELQAKVSHSRERALSTMDATP 

HHAQPQRGEG 


211 


1561 


A 


2431 


1 


764 


RRYSQKLJQHTACQiXRTYPAATRlDSSNPNP 

LMFWLHGIQLVALNYQTDDLPLHLNAAMFE 

ANGG CG Y VLKPPVL WDKNCPM YQKF SPLER 

DLDSMDPAVYSLTIVSGQNVCPSNSMGSPCIE 

VDVLGMPLDSCHFRTKPIHRNTLNPMWNEQF 

LFHVHFEDLVFLRFAWENNSSAVTAQRIIPL 

KALKRGYRHLQLRNLHNEVLEISSLFINSRRM 

EENSSGNTMSASSMFNTEERKCLQTHRVTVH 

GVPG 


212 


1562 


A 


2436 


1 


411 


GIRGTTGHLGCPINDDPSLTLTVSWVMEDKP1 

YIGNGTKKEDDSLTIFAVAKRDHVSDTCGAC 

TDLDHNLDKGYLTVLGEQATPTNRLGALPKG 

RANRTRDLELTYLAER1VRLTWIPGDANNRPI 

TDYDCQIEEHQ 


213 


1563 


A 


2445 


1 


1294 


MSSIGCLWVSRSSQ1DGLTAEKSGPEKPHGT 

WLMPELHPKEQILELLVLEQFLSILPEELQIWV 

QQHNPESGEESVTLLEDLEREFDDPGQQVPAS 

PQGPAVPWKDLTCLRASQESTDIHLQPLKTQ 

LKSWKPCLSPKSDCENSETATKEGISEEKSQG 

LPQEPSFRGISEHESNLVWKQGSATGEKLRSP 

SQGGSFSQVIFTNKSLGKRDLYDEAERCLILT 

TDSIMCQKVPPEERPYRCDVCGHSFKQHSSLT 

QHQRIHTGEKPYKCNQCGKAFSLRSYLIIHQR 

IHSGEKA YECSECGKAFNQSSAL IRHRKIHTG 

EKACKCNECGKAFSQSSYLIIHQR1HTGEKPY 

ECNECGKTFSQSSKLIRHQRIHTGERPYECNE 

CGKAFRQSSELITHQRIHSGEKPYECSECGKA 

FSLSSNL1RHQRIHSG 


214 


1564 


A 


2461 


1 


615 


GIPGSTISSSRNIFLEDDLAWQSUHPDSSNTPL 
STRLVSVQEDAGKSPARNRSASITNLSLDRSG 
SPMVPSYETSVSPQANRTYVRTETTEDERKIL 
LDSVQLKDLWKKICHHSSGMEFQDHRYWLR 
THPNCIVGKELVNWLIRNGHIATRAQAIAIGQ
AMVDGRWLDCVSHHDQLFRDEYALYRPLQV
LFSVYCQLECSKLIL 


215 


1565 


A 


2464 


3 


2932 


GPGVRSSQDGMADVFVHLRTAWPRCSFISGQ 

HGPGRHGRRVCSSQD SMADVFVHLRTAWPT 

CSLISGQHGPGESVSYEDDDIPAPASLLHVNA 

AAPALTNPTAPVLCTAPNNTAQKEKVPSGMR 

QRPAGVRISSRTPDLTCAVSTHSTVPGVRISSC 

TPDLTCAVS1HSTVPSVCISSCTPDLTCAVSTH 

STVPGVRISSCTPDLTCAVSTHSTVPGVRISSR 

TPDLTCAVSIHATVPGVRISSCTPDLTCAVSIH 

ATVPGVR1SSCTPDLTCAVSTHSTVPGVR1SSR 

TPDLTCAVSIHSTVPGVRISSCTPDLTCAVSIH 

ATVPGVRISSCTPDLTCAVSTHSTVPGVR1SSR 

TPDLTCAVSIHATVPGVRISSRTPDLTCAVSIH 

ATVPGVR1SSCTPDLTCAVSIHATVPGVRISSC 

TPDLTCAVSIHATVPGVRISSRTPDLTCAVSIH 

ATVPGVRISSCTPDLTCAVSTHSTVPGVRISSR 

TPDLTCAVSIHATVPGVRISSCTPDLTCAVSTH 

STVPGVR1SSRTPDLTCAVSIHATVPGVHISSC 

TPDLTCAVSTHSTVPGVRISSRTPDLTCAVSIH 

STVPGVCISSRTPDLTCAVSIHSTVPSVHISSCT 

PDLTCAVSIHSTVPGVRISSRTPDLTCAVSTHS 

TVP/Tl/'I-lJCQf ,, TTTiT Tf^ A VQfU A T\/DfW/TJlOQ/~' r r 
I VrVjvrllo&^I J UljlL-AvDlrlAl VrO VriloaC J 

PDLTC AVSTHTT VPGVRI SSRTPDLTCAVSIHS 

TVPGVRISSCTPDLTCAVSTHSTVPGVRISSRT 

PDLTCAVSTHLTVPGVRISSRTPDLTCAVSIHA 

TVPGVMSSCTPDLTCAVSIHATVPGVR1SSRT 

Liu/Li i v-rvv oixi/\ l v r vj v nioo^ i ruiL i lavo i no 

TVPGVRISSRTPDLTCAVSIHSTVPGVHISSCT 

PDLTCAVSTHSTVPGVHISSCTPDLTCAVSTH 

STVPGVHISSRTPDLTCAVSIHATVPSVHISSC 

TPDLTCAVSIHSTVPGLLTSVSQTSTG 


216 


1566 



A 


2477 


1 


414 


FUTK YR K n*? YP CI VS F WT A ROOTS! WOFlOF K 

AVEVATWIQPTVLRAAVPKNVSVAEGKELD 
LTCNin'DRADDVRPEVTWSFSRMPDSTLPGS 
RVLARiDRDFLVHSSPHVALSHVDARSYHLL 
VRDVSKENSGYYY 


217 


1567 


A 


2480 


2 


460 


CRTLCEGPQRFEEYEYLGYKAGLYEA1ADHY 
MQ VL VCQHEC VREL ATRPGRL SPIENFLPLHY 
DYLQFAYYRVGEYYKALECAKAYLLCHPDD 
EDVLDNVDYYESLLDDSIDPASIEAREDLTMF 
VKRHKLESELIKSAAEGLGXSYTEPNYW 


218 


1568 


A 


2483 


140 


383 


AFSSPHPSPAPQFPECGFYGLYDKILLFKHDPT 
SANLLQLVRSSGDIQEGDLVEVVLSASATFED 
LQ1RPHALTVHSYRAP 


219 


1569 


A 


2489 


3 


428 


SSRLVLLAGAAALASGSQGDREPVYRDCVLQ 

CEEQNCSGGALNHFRSRQPIYMSLAGWTCRD 

DCKYECMWVTVGLYLQEGHKVPQFHGKWP 

FSRFLFFQEPASAVASFLNGLASLVMLCRYRT 

FVPASSPMYHTCVAFAWVS 


220 


1570 


A 


2498 


1 


1297 


MDGEAVRFCTDNQC V SLHPQEVDS V AMAP A 

APKIPRLVQATPAFMAVTLVFSLVTLFVVDH 

HIIFGREAEMRELIQTFKGI^MENSSAWVVEIQ 

MLKCRVDNVNSQLQVLGDHLGNTNADIQMV 

KGVLKDATTLSLQTQMLRSSLEGTNAEIQRL 

KEDLEKADALTFQTLNFLKS SLENTSIELH VL 

SRGLENANSE1QMLNASLETANTQAQLANSS 

LKNANAEIYVLRGHLDSVNDLRTQNQVLRNS 

LEGANAEIQGLKENLQNTNALNSQTQAFIKSS
FDNTSAE1QFLRGHLERAGDE1HVLKRDLKM
VTAQTQKANGRLDQTDTQIQVFKSEMENVN 
TLNAQIQVLNGHMKNASREIQTLKQGMKNA 
S ALTSQTQMLD SNLQKAS AEIQRLRGDLENT 
KALTME1QQEQSRLKTLHWITSQEQLQRTQ 


221 


1571


A 


2501 


3 


500 


RVRLN>©GLSPmMAAKTGKIGIFQHl]RREV 

TDEDTRHLSRKFKDWAYGPVYSSLYDLSSLD 

TCGEEASVLEILVYNSKJENRHEMLAVEPINE 

LLRDKWRKFGAVSFYINWSYLCAMVIFTLT 

AYYQPLEGTPPYPYRTTVDYLRLAGEVITLFT 

GVLFFFTN 


222 


1572 


A 


2508 


3 


395 


DAHCQRKLAMQEFMEINERLTELHTQKQKL 

ARHVRDKEEEVDLVMQKVESLRQELRRTER 

AKKELEVHTEALAAEASKDRKLREQSEHYSK 

QLENELEGLKQKQISYSPGVCSIEHQQEITKL 

KTDLEKKS 


223 


1573 


A 


2544 


2 


412 


NDPAIISNFSAAVVHTIVNETLESMTSLEVTK 

MVDERTDYLTKSLKEKTPPFSHCDQAVLQCS 

EASSNKDMFADRLSKSIIKHS1DKSKSVIPNID 

KNAVYKESLPVSGEESQLTPEKSPKFPDSQNQ 

LTHCSLSAA 


224 


1574 


A 


2552 


401 


1 


GASLCFISTAFTVLTFLIDSCRFSYPERPIIFLSM 

CYN1YS1AYIVRLTVGRERISCDFEEAAEPVLI 

QEGLKNTGCAIIFLLMYFFGMASSIWWVILTL 

TWFLAAGLKWGHEAIEMHSSYFHIAAWAIPA 

VK 


225 


1575 


A 


2563 


724 


1 


MSARKERREKGEEEGEGEKDGDEDEKEEEKE 
GLGEEEEKEAGKKKKKQEEKEKEKGAVYSR 
VARICKNDMGGSQRVLEKHWTSFLKARLNC 
SVPGDSFFYFDVLQSITDIIQINGIPTVVGVFTT 
QLNS1PGSAVCAFSMDDIEKVFKGRFKEQKTP 
DSVWTAVPEDKVPKPRPGCCAKHGLAEAYK 
I SIDFPDETLSFIKSHPLMDSA VPPIADEPWFT 
KTRVRYRLTA1 S VDHS AGPYH 


226 


1576 


A 


2571 


449 


3 


EGVLFVYGNYVGDVMNFEMAAEMAQEVAJP 
TRTVLTTDD1SSSP1 EDRDGRRGVAGNFF1FKV 
AGAACDRGMSLEACEAVTRKANRRTYTMG 
VALEPCSLPQTRRHNFEIGAEEMEIGMGIHGE 
RGV IREKMMP ADAJVDHIMDRIFS 


227 


1577 


A 


2575 


3 


1197 


VLSDLCLFYYRDEKEEGILGSILLPSFQIALLTS 

EDHINRKYAFKAAHPNMRTYYFCTDTGKEM 

ELWMKAMLDAALVQTEPVKRVDKITSENAP 

TKETNNIPNHRVLIKPEIQNNQKNKEMSKIEE 

KKALEAEKYGFQKDGQDRPLTKINSVKLNSL 

PSEYESGSACPAQTVHYRP1NLSSSENKIVNVS 

LADLRGGNRPNTGPLYTEADRVIQRTNSMQQ 

LEQW1K1QKGRGHEEETRGV1SYQTLPRNMPS 

HRAQ1MARYPEGYRTLPRNSKTRPESICSVTP 

STHDKTLGPGAEEKRRSMRDDTMWQLYEW 

QQRQFYNKQSTLPRHSTLSSPKTMVNISDQT 

MHS1PTSPSHGS1AAYQGYSPQRTYRSEVSSP1 

QRGDVTIDRRHRAHHPKVK 


228 


1578 


A 


2583 


3 


330 


LPFLGLGSVLPQGMVMASPEMNPT1CSVFEA 
HIVLLFHATTFRRGFQVTVLVGNVRQTAWE 
KIHAKVRGTWPFISPEVRKEGGLPQTGRELLD 
PTMGIKPHLWWVAA 


229 


1579 


A 


2589 


1 


448 


DDKNAQG1KRHVKPTSGNAFTICKYPCGKSR 
ECVAPN I CKCKPG YI G SNCQTAL CDPDCKNH 
GKCIKPNICQCLPGH GGATCDEEHCNPPCQH
GGTCLAGNLCTCPYGFVGPRCETMVCNRHC
ENGGQCLTPDICQCKPGWYGPTCSTA 


230 


1580 


A 


2593 


2 


138 


AVTFSWFAYVADITQEHERSMAYGLVCMFI 
LYLLYLLRNAFFLR 


231 


1581 


A 


2595 


185 


2 


SGPYTDFTPWPTEEQKLLEQALKTYPVNPPER 
WEKIAEA VPGRTKKACIKRYKV ADLRI SK 


232 


1582 


A 


2596 


1 


391 


STVTGQPRRLLDTAGHQQPFLELKIRANEPGA 
GRARRRTPTCEPATPLCCRRDHYVNFQELGW 
RDWILLPEG YQLNYCSGQCPTHLAGSPG I AA S 
FHSAVFSLLKANNPWPGRTSWCVPTARRPLS 
LLYL 


233 


1583 


A 


2601 


184 


403 


LLFSDEIIMAAPLRIADVTSGLIGGEDGRVYV 

YNGKETTLGDMTGKCKSW1TPCPEEKVNVLQ 

NSIPYWERIT 


234 


1584 


A 


2614 


178 


335 


PLTLCLPENNKPPQADAVPDKELTLPVDST'll, 
DGSKSSDDQK1ISYLWEKTQ 


235 


1585 


A 


2616 


2 


896 


DVLEVYGTGVASTRHEMGTLDKHKELEDLV 

AKFLNVEAAMVFGM GFATNSMNIPALVGKG 

CULRDEVNHTSLVLGARLLGAT1GIFKHNYA 

QSLEKLLRDAVIYGQPRTRRAWKXILILVEGV 

YSMEGSIVHLPQIIAJLKKKYKAYLYIDEAHSI 

GAVGPTGRGVTEFFGLDPHEVDVLMGTFTKS 

FGASGGYIAGRKARJLSPPACLVPNTGSHSLH 

RLTRDLQMNEAMVALVTDRLQGWNSGEGN 

WDRADKFGDLVDYLRVHSHSAVYASSMSPPI 

AEQ1IRSLKL1MGLDGTTQ 


236 


1586 


A 


2621 


1 


392 


NTSSFPAQPSSPARPSLPHLSQHPSNPLLPLAS 

ADHPQCGRFLPLHEPEPLCPSPSLSYPTLVSS 

WSSPFSSHHGCPPGLYPFPTSPKTIQPPGLAQL 

KMLCIPPGRQQLRGAQSMPGHGALSPLLLPP 

A 


237 


1587 


A 


2628 


398 


1 


DLVCKJSGFGRGPRDRSEAVYTTMSGRSPAL 

WAAPETLQFGHFSSASDVWSFGIIMWEVMAF 

GERPYWDMSGQDVIKAVEDGFRLPPPRNCFN 

LMHRLMLDCWQKDPGERPRFSQIHSILSKMV 

QDPEPPNV 


238 


1588 


A 


2631 


1 



1104 


WSPCSLTCGVGLQTRDVFCSHLLSREMNETV 

ILADELCRQPICPSTVQACNRFNCPPAWYPAQ 

WQPCSRTCGGGVQKREVLCKQRMADGSFLE 

LPETFCSASKPACQQACKKDDCPSEWLLSDW 

TECSTSCGEGTQTRSAICRKMLKTGLSTVVNS 

TLCPPLPFSSSIRPCMLATCARPGRPSTKHSPHI 

AAARKVYIQTRRQRKLHFVGGGFAYLLPKTA 

WLRCPARRVRKPLITWEKDGQHLISSTHVT 

VAPFGYLKJHRLKPSDAGVYTCSAGPAREHF 

VIKL1GGNRKLVARPLSPRSEEEVLAGRKGGP 

KEALQTHKHQNGIFSNGSKAEKRGLAANPGS 

RYDDLVSRLLEQGAPCSSSKKKN 


239 


1589 


A 


2636 


1 


678 


MKPDNILLDEHGHVH1TDFNIAAMLPRETQIT 

TMAGTKPYMAPEMFSSRKGAGYSFAVDWW 

SLGVTAYELLRGRRPYH1RSSTSSKE1VHTFET 

TWTYPS A WSQEM V SLLKKLLEPNPDQRFSQ 

LSDVQNFPYMNDINWDAVFQKRL1PGF1PNK 

GRLNCDPTFELEEMILESKPLHKKKKRLAKK 

EKDMRKCDSSQTCLLQEHLDSVQKEF11INRE 

KVNRDCI 


240 


1590 


A 


2639 


389 


3 


ELLDPTTPMRTKC1ELLYAALTSSSTOQPKAD 
LWQNFAREIEEHVFTLYSKNIKKYKTCIRSKV 
ANLKNPRNSHLQQNLL SGTTSPREFAEMTVM
385



200 



459 



1271 



506 



267 



404 



440 



21 



Amino acid sequence (A=Alanine, C=Cysteine,
D=Aspartic Acid, E=Glutamic Acid,
F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine,
M=Methionine, N=Asparagine, P=Proline,
Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan,
Y=Tyrosine, X=Unknown, *=Stop codon,
/=possible nucleotide deletion, \=possible
nucleotide insertion)



EMANKELKQLRASYTESC1QEHYLPQVIDGTL 
Y 



1RLTILRCVFMRLAT1CVLVFTLGSK1TSCDDD 
TCDLCGYNQKLYPCWETQVGQEMYKLMIFD 
F1IILAVTLFVDFPRKLLVTYCSSCKUQCWGQ 
QEFA1PDNVLGIVYGQTIC Wl GAPFSPLLPAM 

Y 

YFKNTTLLL VG VIC V AAAVEK\VNLHKRJ AL R 

MVLMAGAKPGMLLLCFMCCTTLLSMWLSNT 

STTAMVMPIVE A VL QELV S AEDEQL V AGNSN 

TEEAEPI SLDVKNS QPS VEL1FVNEDILDFLMK 

SPLMISQAC1 

CLAMIKG1QSSGKI1YFSSLFPYWLICFLIRAF 
LLNGSIDGIRHMFTPKLE1MLEPKVWREAATQ 
VFFALGLGFGGV1AFSSYNKRDNNCHFDAVL 
VSFTNFFTSVLATLVVFAVLGFKANVrNEKCIT 

QNSETV 

MTTTLIGLLKTARLLRLVRVARKLDRYSEYG 

AAVLMLLMCIFALIAHWLACIWYA1GNVERP 

YLTDKIGWLDSLGQQIGKRYNDSDSSSGPSIK 

DKYVTALYFTF SSLTSVGFGNVSPNTNSEKIF 

S1CVMLIGSLMYAS1FGNVSA1IQRLYSGTARY 

HMQMLRVKEFIRFHQIPNPLRQRLEEYFQHA 

WTYTNG1DMNMVTNGTCSSCTSDDGHF1LVS 

NHHQGGLIYSWNDAASMQRPFNH1KSSLLGS 

TSDSNLNKYSTINKIPQLTLNFSEVKTEKKNSS 

PPSSDKTIIAPKVKDRTHNVTEKVTQVLSLGA 

DVLPEYKLQAPRINKFT1LHYSPFKAVWDWLI 

LLLVI YTA1 FTPYSAAFLLNDREEQKRRECGY 

SCSPLNVVDLIVDIMFI1DILINFRTTYVNQNEE 

WSDPASV 

NLTWWPLFRDVSFYIVDLIMLIIFFLDNVIMW 
WESLLLLTAYFCYWFMKFNVQVEKWVKQ 
MINRNKWKVTAPEAQAKPSAARDXDEPTLP 
AKPRLQRGGSSASLHNSLMRNSIFQNK1HTLD 

PHV 

VLVLQMNYYQMLIIYYVLFFKVNEFLAFEGPI 
LLDMRIKHLIKTNQLSQATALAKLCSDHPE1G 
IKGSFKQTYLVCLCTSSPNGKJL1EEVSMFSFIS 

NYFLS 

DAWVKNDIIFNQTERKQK1SENLKHLASVRV 

VQKNL VF W GLSQRLADPEVSPL VFFVIL1FF 

VSLSYLEIIFDPAQLCPSSEHIIS 



DFTTLAAMMRTLFSLFGDVRSDVHRFSVTLF 
GAAIKSVKNPDKKSIENQVLDSLVPLLLYSQD 
ENDAVAEESRQVLT1CAQFLKWKLPREVYSK 
DPWHIKPTEAGTICRFFEKKCKGKINILEQTL 
MYSKNPKL 



FRRRRRRRERDCAAQGARRHCRHLAECKLV 

SFPIGIYKVLRNVSGQIHLITLANNELKSLTSK 

FMTTFSQLRELHLEGNFLHRLPSEVSALQHLK 

AIDLSRNQFQDFPEQLTALPALETINLEENEIV 

DVPVEKLAAMPALRSINL 



404 



LLPGSLGVPILHSQPWDPSPQCPHRAPSTPRRL 
PPLGALSQALTFLSRAAKNHSQDPGKGTKPFP 
AAPAAPPPRSSLPAPLPMGLKDKGPQPAPPTIF 
NSPWHPATLPGALGPQLSQAAPSPIPPPCLMG 
1SSCPDLKLTKSSTP 



FVFDLKLRVPGFAALL1HGASSVPGPETVRLR














QKRKJCKAPDHSSGRKEELVTTHTVDKLJETKK 
PVGRVLCGLSGEIXHSLLLPRRKTEKRALGSH 
RKAGFPEHPVAPEPLSNSCQISKEGREQVLSEI 


252 


1602 


A 


2697 


421 


1 


PQKSH SGA YQCFATRKAQTAQDFAII ALEDG 

T"Dl>T\/OCT?OI7V W/TVTDi^T7/^lTCT \AC y A AVfi ADDDT 

1 JrKJ Vo or ocJVY vlNrOJbV^roljMCAAiVOArrr J 
VTWALDDEPIVRDGSHRTNQYTMSDGTTISH 
MNVTGPQIRDGGVYRCTARNLVGSAEYQARI 
NVRGPPSIRAMRNIT 


253 


1603 


A 


2698 


65 


401 


ACCQWRRTLIPAKSTTVSCTISTPHHPFRGSYS 
FDDHITDSEALSRSSHVFTSHPRMLKRQPAIEL 
PLGGEYSSDVPRPLSTQLSSSLLGYFSTLMTG 
AAFTNNIASSTIIL 


254 


1604 


A 


2699 


438 


301 


GQIHSQDDPPFIDQLGFGVAPGFQTFVACQEQ 
RVRGPWEAGPGVGY 


255 


1605 


A 


2700 


1 


842 


LQNREDSSEGIRKJCLVEAEELEEKHREAQVS 

AQHLEVHLKQKEQHYEEKIKVLDNQIKKDLA 

DKETLENMMQRHEEEAHEKGKJLSEQKAMIN 

AMDSKJRSLEQRIVELSEANKLAANSSLFTQR 

NMKAQEEMISELRQQKFYLETQAGKLEAQN 

RKLEEQLEKISHQDHSDKKRLLELETRLREVS 

LEHEEQKLELKRQLTELQLSLQERESQLTALQ 

AARAALESQLRQAKTELEETTAEAEEEIQALT 

VGLGSNIFRLLKASARMSVELALSILAHP 


256 


1606 


A 


2701 


2 


405 


FVGGPGADPPVAVMWDPRAARMDLTAYAE 

LLKESGNQVLKNGNFSLAIRKYDEAIQILLQL 

YQWGVPPRDLAVLLCNKSNAFFSLGKWNEA 

FVAAKJECLQWDP1TVKGYYRAGYSLLRLHQ 

PYEAARMFFEGLR 


257 


1607 


A 


2702 


2 


399 


FVESASSRPPGCFSGDGRFWLVSEGSRRGWD 
FNPSFSFLDPRY SVGGDENIGTVTTLANILREF 
NPSLKGFSVGTGKETSPNAFLNQAVAGGRAE 
DLPVQARRLVDLMKNDTRIHFQEDWKIITLF1 
GGNDL 


258 


1608 


A 


2709 


1 


1097 


SVGARQGEARDRIRRFFPKGDLEVLQAQVERI 

MTRKELLTVYS SEDGSEEFETIVLKAL VKACG 

SSEASAYLDELRLAVAWNRVDIAQSELFRGDI 

QWRSFHLE ASLMD ALLNDRPEF VRLL I SH GLS 

LGHFLTPMRLAQLYSAAPSNSLIRNLLDQASH 

SAGTKAPALKGGAAELRPPDVGHVLRMLLG 

KMCAPRYPSGGAWDPHPGQGFGESMYLLSD 

KATSPLSLDAGLGQAPWSDLLLWALLLNRA 

QMAMYFWEMGSNAVSSALGACLLLRVMAR 

LEPDAEEAARRKDLAFKFEGMGVDLFGECYR 

SSEVRAARLLLRRCPLWGDATCLQLAMQAD 

ARAFFAQDGVQSLPTQKWWGDMARR 


259 


1609 


A 


2721 


1 


403 


VYLGAGPGLFFSNEGAKEGEKANIPKLMLPR 
GGFSQREMVTGERSPSPEEEEEEEEEGFGERA 
SCRRGLFRVRLTRVGLAAPSKASRGQEGDAA 
r Is. or V KJcl^^r AxKr rK V oLor JvAKo U oOL' ybc. 
GGLRVRLP 


260 


1610 


A 


2728 


1 


477 


LLGGDLRYHLQQKVHFTEGTVKLY1CELALA 
LEYLQRYmiHRDIKPDNILLDEHGHVHITDFN 
IATWKGAERASSMAGTKPYMAPEVFQVYM 
DRGPGYSYPVDWWSLG1TAYELLRGWRPYEI 
HSVTP1DEILNMFKVERVHYSSTWCKGMVAL 
LRK 


261 


1611 


A 


2730 


3 


547 


LTITDFILVLYRYYRSPLVQIYEIEQHK1ETWR 
EIYLQGCFTCPLVSISF^SLFEAVYTLIKNRIH














RLPVLDPVSGNVLHILTHKRLLKFLHIFGSLLP 
RPSFLYRTIQDLGIGTFRDLAWLETAPILTAL 
DIFVDRRVSALAWNECGTHPQDERLGLGW 
GLGEPGSEERLFPAA1TSR 


262 


1612 


A 


2733 


3 


431 


GPEFPGSAKLVFLDLSYNNLTQLGAGAFRSA 

GRLVKLSLANNNLVGVHEDAFETLESLQVLE 

LNDNNLRSLSVAALAALPALRSLRLDGNPWL 

CDCDFAHLFSW1QENASKLPKGLDEIQCSLPM 

ESRRISLRACRRPASRV 


263 


1613 


A 


2736 


2 


343 


PARI SG VDPPVRKATKGGENC SFEDNKNWQF 
LWGLNGNFNFFKEPWGGRNNHAKGFRTTW 
ARSSSQNNRTFQNNRNFLRLQRDSQKKGQFA 
RLI SPL VNLPQSPGGLEFQYQ AT 


264 


1614 


A 


2738 


2 


245 


RAMLKCLREGQPPPSYN WTRLDGPLP SG VRV 
DGDTLGFPPLTTEHSGIYVRHDTNEFSSRDSH 
DTVDVLDPPEDSGKQVDL 


265 


1615 


A 


2752 


2 


388 


AAGDAPLRSLEQANRTRFPFFSDVKGDHRLV 
LAAVETTVLVLIFAVSLLGNVCALVLVARRR 
RRGATACLVLNLFCADLLFISADPLVLAVRWT 
EAWLLGPVACHLLFYVMTLSGSVTILTLAAV 

SLER 


266 


1616 


A 


2755 


192 


1 


AFREVGGYWGLLCEHLYAIPSKTSEGNWTAK 
LQGYLPLQDAFHIFQDPLTGDLPWPELILGLP 

V 


267 


1617 


A 


2760 


434 


714 


ASRLEKQNSTPESDYDNTFNDMEPDGMGYM 
HRTSVPGEGLPRARDLAGLGQQKQFTTHTPF 
LYFQTHKGLKDSSIRSEVTCLGISQCWRKGFF 


268 


1618 


A 


2762 


1 


405 


IACTFCGQDEWSPERSTRCFRRRSRFLAWGEP 
AVLLLLLLLSLALGLVLAALGUVHHRDSPL 
VQASGGPLACFGLVCLGLVCLSVLLFPGQPSP 
ARCLAQQPLSHLPLTGCLSTLFLQAAEIFVESE 

LPLSWAE 


269 


1619 


A 


2772 


3 


243 


TRPAEKIQYLVLFFVMSHPSQAYDKLSLSDHL 
LIAVLNLLRREVSEHGRHLQQYFNLFVMYAN 
LSKNLSFSEFCFDVSY 


270 


1620 


A 


2789 


1 


486 


ELQSQQACTHTKETEQLRSQLQTLKQQHQQA 

VEQIAKAEETHSSLSQELQARLQTVTREKEEL 

LQLSIERGKVLQNKQAEICQLEEKLE1ANEDR 

KHALERFEQEAVAVDSNLRVRELQRKVDGIQ 

KAYDELRLQSEAFKKHSLDLLSKERELNGKL 

RHLSP 


271 


1621 


A 


2795 


1 


568 


KEKRVTVQLPTESIQKNQEDKLKMVPRKQRE 

FSGSDRGKLPGSEEKNQGPSMIGRKEERLITE 

RKHEHLKNKS APK WKQKV1DAHLD S QTQN 

FQQTQIQTAESKAEHKKLPQPYNSLQEEKCLE 

VKGIQEKQVFSNTKDSKQE1TQNKSFFSSVKE 

SQRDDGKGALNIVEFLRKREELHQILSTVKQP 


272 


1622 


A 


2797 


8 


523 


KCMQGKYAGAMESEPCVCTEADFDCDYGYE 

RHSNGQCLPAFWFNPSSLSKDCSLGQSYLNST 

GYRKWSNNCTDGVREQYTAKPQKCPGKAP 

RGLRIVTADGKLTAEQGHNVTLMVQLEEGD 

VQRTLlQVDFGDGIAVSYVNLSbMEDolAH V 

YQNXG1XRXTVQVDNSLGS 


273 


1623 


A 


2801 


72 


395 


HPSRSNVGPRQLTVWNTSNLSHDNRRKY1FS 

DEEGQNQLGIRIHQDIPLPPRRRELPALRTTNG 

KADSLNVSRNSVMQELSELEKQIQVIRQELQL 

AVSRKTELEEYH 


274 


1624 


A 


2805 


168 


320 


ILWLYFETGTWVYPVFAKLSLLGLAALFSLRE 
IFIARNGWGETLTHCKRV
275


1625 


A 


2812 


208 


321 


GSLATCQLSEPLLWFILRVLDTSDALKAFHD 
MGKIIFQ 


276 


1626 


A 


2813 


41 


266 


AGRSLHGAGDRAWVG1SPTDWSPKVVELCK 
KYQQQTVV AIDL AGDETIPG S SLLPG H VQA Y 
QVGPVRRNGEAGPG 


277


1627 


A 


2817 


3 


410 


VLQERLDNFQRKCIQLASSTEGKVDKLLMRN 

LFISYLHTPKHKQHEVLQAMGSILGITGEEME 

PLFQEEHGTATRWMTGWLEGGSKSVPKTPL 

GLNQQPALNGSFSELFVKFLKTESLSSTLPTX 

LPPHNSPGKIK 


278 


1628 


A 


2821 


238 


457 


GLSGPSCSCPHSPLPTIISRAQLETALKWRNYE 
VKLRLLLHLEELQMEHDIRHYDLESVPMTWD 
PVDQNPRLV 


279 


1629 


A 


2822 


342 


1 


PLIPANLPAHSNPLQPLPSLPHPFLPATHKFPT 
TPPTFSSVPPPLPSLSSILHHSPLHSELNPHLOS 
CRLPSRPSVSRELPPQSGPASSVPLAPTPLPDS 
VPSQRHPTXPPPAS 


280 


1630 


A 


2825 


307 


77 


PSMVWSYHWGVKQKRLALCVFSFEEGGRRK 

CGOYWPLEKDSRiRFGFLWTNLTGAVGEPG 

VAFQCDGQRRREPTC 


281 


1631 


A 


2827 


81 


381 


KMGTAVWVPKEKEKRDKASQEGGDVLGAR 
QDCTPS LKSLVATGNLLDLEETAKAPLSTVS A 
KITT>JNTOEVTRPOALSGSSVVWVSGCVASRS 
V1LSLTSG 


282 


1632 


A 


2830 


471 


160 


KLPXDKYELEPSPLTQYJLERKSPHTCWQVFV 
TSSGKYNELGYPFGYLKASTrLTCVNLFVMP 
YNYPVLLPLLDDLFKVHKLKPNLKWRQAFDS 
YLKTLPPYYL 



283 


1633 


A 


2835 



462 


148 


VSPALSLTPTIFSYSPSPGLSPFTSSSCFSFNPEE 
MKHYLHSQACSVFNYHLSPRTFPRYPGLMVP 
PLOCOMHPEESTOFSIKLOPPPVGRKNRFRVE 
SSEESAP 


284 


1634 


A 


2836 


2 


384 


KTLPRTLLD1LADGTILKVGVGCSEDASKLLQ 
D YGLV VRGCLDLR YL AMRQRNNLLCN GLSL 
KSLAETVLNFPLDKSLLLRCSNWDAETLTED 
QVIYAARDAQISVALFLHLLGYPFSRNSPGEK 
KR 


285 


1635 


A 


2843 


20 


271 


PIRPYYSYSGLDRDCSWLPLAKAWLPDVMIL 
VCDRVSEDGINRQQAQEWCIKHGFELVELSP 
EELPEEDGKCLCVRRKYGTYI 


286 


1636 


A 


2845 


197 


278 


TAEDVLTVAYEHGVNLFDTAEVYAAGK 


287 


1637 


A 


2851 


2 


427 


FVAEVRREWAXYMEVHEKASFTNSELHRAM 
NLHVGNLRLLSGPLDQVRAALPTPALSPKDK 
AVLQNLKRILAKVQEMRDQRVSLEQQLRELI 
QKDD1TGSLV ri'DHSQMKKLFEEQLKKYDQL 
KVYLEQNLAAQDRVLCALT 


288 


1638 


A 


2859 


2 


469 


FVNLG1LTCIECSGIHREMGAHISRJOSLELDK 

LGTSELLPAKNVGNNSFNDIMEANLPSPSPKP 

TPSSDMTVRKEYITAKYVDHRFSRKTCSTSSA 

KLNELLEAJKSRDLLALIQVYAEGVELMEPLL 

EPGQELAETALHLAVRTADQTSLHLVE 


289 


1639 


A 


2861 


2 


454 


FVASGGPATARMSDSQFFCVAEERSGHCAW 

DGNFLYVWGGYVSlEDNEVYLPNDEIWTYDr 

DSGLWRMHLMEGELPASMSGSCGACINGKL 

YIFGGYDDKGYSNRLYFVNLRTRDETYIWEK 

ITDFEGQPPTPRDKLSCWVYKDRLIYFG 


290 


1640 


A 


2868 


1 


378 


FRQGQL YKVFLHGSQGQWH SQQ VGPPGSA1 
SPDLLLDSSGSHLYVLTAHQVDRIPVAACPQF 
PDCASCLQAQDPLCG WCVLQGRCTRKGQCG 
RAGQLNQWLWSYEEDSHCLHIQSLLPGHHPR 

QE 


291 


1641 


A 


2870 


1 


385 


FRYMPNNRQQLLRKRHIGNDIVTIVFQEPGAL 

PFTPKSIRSHFQHVFVIVKVHNPCTENVCYSV 

GVSRSKDVPPFGPPIPKGVTFPKSAVFRDFLL 

AKVINAENAAHKSEKFRAMATRTRQEYLKD 

LA 


292 


1642 


A 


2877 


3 


188 


RPTRPPPATTQSPESTMDTSLKKEKSAILDLY1 
PPPPAVPYSPRYVAVHCHGMLVSCWCHL 


293 


1643 


A 


2878 


1 


427 


REKEEEVEEEEDKVVKETEiCEA EQEKE EDSL 
GAGTHPDAAIPSGERTCGSEGSRSVLDLVNYF 
L SPEKLT AENRY YCES CASLQD AEK VVELSQ 
GPCYLILTLLRFSFDLRTMRRRKILDDVSIPLL 
LRLPLAGGRGQAYDL 


294 


1644 


A 


2879 


109 


245 


QLCCFCFRQTTLIVYILSFIGMVIFTFTLDLRYI 
HVFVTGGVLG 


295 


1645 


A 


2880 


3 


320 


LASSQHGILNNLSLLFS1CKTCIRTMDHHCPRA 
NNCVGEQNHRFFCALHCKSKHFCIEFTLNTNF 
FNCFLPGAEKSTIDAPFSLQPFLQDSKYNTALS 
LSES1SQ 


296 


1646 


A 


2892 


209 


363 


SQYSHSLDYHLLQVTKNPFTLGDSSNPGQTE 
RLQEFSQKMDQVRGHWPVST 


297 


1647 


A 


2893 


8 


424 


SPXTLXLDTFILLGIQDNILVLILATPPFMAGG 
KLY STMGRFLRDRKNPACREMA WLLANLA 
QGDSLAARAIAVQKGSIGHLLGFLEDSLAAT 
QIQQSQASLLHMHNPPFEPTSVDMMRRACRA 
LLALAKVDDNHSEF 


298 


1648 


A 


2894 


310 


445 


FWIYFPSFFMTGYLPLGFEFAVEITYPESEGTS 
SGLLNASAQVNL 


299 


1649 


A 


2898 


1 


492 


KIKAKNLTNYDLCSIFLGTSTLLVWVGVIRYL 

GYFQAYNVLILTMQASLPKVLRFCACAGMIY 

LGYTFCGWIVLGPYHDKFENLNTVAECLFSL 

VNGDDMFATFAQIQQKSILVWLFSRLYLYSFI 

SLFIYMILSLFIALITDSYDTIKKFQQNGFPETD 

LQEF 


300 


1650 


A 


2901 


1 


445 


PVWWNSLNGASEVTFSVHVKDGGSFPKTDST 

TVTVRFVNKADFPKVRAKEQTFMFPENQPVS 

SLVTTITGSSLRGEPMSYYIASGNLGNTFQ1DQ 

LTGQVS1SQPLDFEKJQKYVVWIEARDGGVPP 

FSSYEKLDITVLDVNDNAPIF 


301 


1651 


A 


2902 


162 


433 


THFICLPLGYCFPLLDKDLQLPSGFNCNFDFLE 
EPCG WMYDHAKWLRTTWASSS SPNDRTFPG 
KPAVSEDMKELRPACSTYFNPRFPYKL 


302 


1652 


A 


2909 


2 


412 


GPQML CKK1YFI W VTRS QCQ FE WLADIM QE V 

EENDHQDLVSVHIYVTQLAEKFDLRTTMLYI 

CERHFQKVLNRSLFTGLRSITHFGRPPFEPFFN 

SLQEVHPQVRKIGVFSCGPPGMTKNVEKACQ 

LVNRQDRAHFM 



303 


1653 


A 


2914 


291 


453 


KLNRWLCFFYSWSFGILLYEMVTLGAPPYPE 
VPPTSILEHLQRRKIMKRPSSCS 


304 


1654 



A 


2926 


179 


354 


PGVPSQALRKAESLKKCLSVMEAKVKAQTAP 
NKDVQREIADLGEVGAASLPPSSGPGA 


305 


1655 


A 


2938 


135 


438 


GMGYLHAKGILHKDLKSKNVFYDNGKVVIT 
DFGLFSISGVLQAGRREDKLRIQNGWLCHLA 
PEIIRQLSPDTEEDKLPFSKHSD VFALGTI WYE 
LHAREWP 


306 


1656 


A 


2944 


2 


329 


VRWNSCVNCSCAFGNGASLSTSLGESSGCLW 
EIGKWLSCSLLSFPSPLAVLIITFCIVTVLGREA 
LTKGALWAVFLLAGSALLCAEVTGVIWRQPE 
SKTKLSFKVSSSA 




307 


1657 


A 




Z 


A 1 1 


xrVT CI A fMQ A (~ZQ A UflYTD 1 \/Vf~*\/Dt>\/lT?XI/"' I 
IN i JjL'IAJSJN oAuoAMuK I KL* V V^Vrr VJXriNUJL 

PDLSTTEGSHAFLPCKARGSPEPN1TWDKDGQ 

CTAENAVGRARRRVHLTILVLPVFl 1 LPGDRS 
LRLGDRLWLR 


308 


1658 


A 


2951 


1 


407 


PTRPPRVRFDNEFDAESQRKRTTSVSKMERM 
DSSLPEEEEDEDKEAINGSGNAENRERHSESS 

T^U/\/fV"T\/PQVXI/^T^QQ\>fT^PT?XlVTVilAyn?r^T7TT T7T> 
JJ WMIV 1 Vro IJNl^l f>OoJVUJrJ\i\l I MIVJKiJc, 1 Lftr 

LPKNWEiMAYTOTGMlYFlDHNTKTTlWLDP 
RLCKKAKAPEDC 


309 


1659 


A 


2954 


2 


179 


QDFLTLTLTEPTGLLYVGAREALFAFSMEALE 
LQGAVRGGAVGGSRACQRARPRGAVLG 


310 


1660 


A 


2959 


1 


419 


QDMMERAIIDTFVGHDVVEPGSYVQMFPYPC 
YTRDDFLr VIEHMMPLCMVIS W VYS VAMT1Q 
HIVAEKEHRLKEVMKTMGLNNAVHWVAWFI 
TGFVQLS1S VTALTA1LKYGQVLMH SHWI1W 
LFLAV Y AV ATIMFCF 


311 



1661 


A 


2963 


3 


465 


MKPQMPGLGAPNGYGPGRGRAGVPGGPERR 

PWWHLLPFSSPGYLGVMKAQKPGAGEGMK 

PQKPGLRGTLKPQKSGHGHENGPWPGPCNA 

RVAPMLLPRLPTPGVPSDKEGGWGLKSQPPS 

AVQNGKLPGHQPPNGYGPGAEPGFNGGLEPQ 

KI 


312 


1662 


A 


2967 


3 


405 


WLAQEWSPCTVTCGQGLRYRVVLCIDHRGM 

HTGGCSPKTKPHIKEEC1VPTPCYKPKEKLPV 

EAKLPWFKQAQELEEGAAVSEEPSFIPEAWS 

ACTVTCGVGTQVR1VRCQVLLSFSQSVADLPI 

DECEGPKPA 


313 


1663 


A 


2969 


2 


430 


VVADNCRQGYLDALRPLERRGLTlCEPVLWT 

LVSKEPPAPADGKWDAGCDQRRKGGLSLNW 

KVPHVQVKDVPNFEQLSPELEAALKKACTRD 

PSRWARFWHSGPGQVLTYLLLPCTLPFEYIYF 

RSRRLWWLPDVPADLWWMQ 


314 


1664 


A 


2971 


422 


33 


LDXSHMALQRLRPGWLAPLFQLRALHLDHNE 
LDALuKOVr VNAoGL-KLLDLboN J LKALGRH 
DLDGLGAJLEKLLLFNNRLVHLDEHAFHGLRA 
LSHLYLGCNELASFSFDHLHGLSATHLLTLDL 
SSNRM 


315 


1665 


A 


2973 


1 


525 


ITVSTHASGSPFGLEPQSGWLWVRAALDREA 
l^bLYll>JtVMAVoubJKJ\JlLuvy ICjI Al VKVol 
LNQNEHSPRLSEDPTFLAVAENQPPGTSVGRV 
FATDRDSGPNGRLTYSLQQLSEDSKAFRIHPQ 
TGEVTTLQTLDREQQS S YQLE VQ VQDGGSPP 
RSTTGTVHVAVLDLNDNT 


316 


1666 


A 




2 


400 


ELWELVSAGKoGPERNi ilSVQVVl GNVPKA 
GTDANVYLTIYGEEYGDTGERPLKKSDKSNK 

FEQGQTDTFTIYAIDLGALTKIRIRHDNTGNR 

Ar.iuci nDTniTTAAixrciTwr D/^/^D\in A\rctr 
AuWrL-UKJiJl llJMiSINiill Y irrV-V^KWLAVE-Jb. 


317 


1667 


A 


2981 


3 


440 


VLNCQGRPTRPVRINGDGQEVLYLAESDNVR 
LGCPYVLDPDDYGPNGLDIEWMQVNSNPAH 
HRENVFLSYQDKRINHGSLPHLQHRVRFAAS 
DPSQYDASINLMNLQVSDTATYECRVKKTTM 
ATRK VI VTVQ ARP A VPMC WTEGQ 


318 


1668 


A 


2995 


119 


414 


LPEKEFPIIRKSSSLKVTKCLFTEQPKPI11LRFA 
ENYDARLLRIDIANTLREQVQELFNKTYGKQ 
RRTTGEGHVAAVDREVAGFPVPAEGISGETIH 


319 


1669 


A 


2999 


2 


332 


GFFAYTYGRLVVVEDLHSGAQQHAVSGHSAEI 
STLAL SH SAQVLASASGRSSTTAHCQIRVWD 
VSGGLCQHLIFPHSTTVLALAFSPDDRLLVTL 
GDHDGRTLALWGTGHL 


320 


1670 


A 


3000 


693 


322 


IDESTGLIITVNYLDYETKTSYMMNVSATDQA 
PPFNQGFCSVYITLLNELDEAVQFSNASYEAA 
1LENLALGTEIVRVQAYSIDNLNQITYRFDAY 
TSTQAKALFKIDAITVRGWGQGAPFFPI 


321 


1671 


A 


3001 


6 


383 


RTPRGKACXTVLGRSTGELEGFASSRLPPQPC 
GWGQSSDLLSRIDLDELMKKDEPPLDFPDTLE 
GFEYAFNEKGQLRH1KTGEPFVFNYREHLHR 
WNQKRYEALGEDTKYVYELLEKDCNSKKVS 


322 


1672 


A 


3007 


192 


447 


ERVRNSLFPGRGDSQCACCPSSPVWVFLETGF 

LFPWLFLQVEVIKKAYMQGEVEFEDGENGK 

DGAASPRNVGHNIYILAHQLARH 


323 


1673 


A 


3019 


18 


245 


KELLFYHLIVNNINFFNTRYAKIHIPIIASVSEH 
QPTT W VSFFFDLHIL VCTFPAGLWFCIKNIND 
ERVFGKRGF 


324 


1674 


A 


3020 


523 


797 


LCYFSARYHQRKIFGILYIFTLSAINRKEPNLFI 
YLFIFFEMESHSVTHAGVQRHNLNSLQPLPPG 
FKRF SCLCFLS S WN Y RG APPGPANF 


325 


1675 


A 


3022 


2 


156 


NDFLPLYFGWVLTKKSSETLRKAGQVFLEEL 
GNHKAFKXELRQCRWQVGAL 


326 


1676 


A 


3023 


38 


172 


KMVRGSKKL1SFFPGGPYGILAGRDPSKGLAT 
FCLNKEALKDEFE 


327 


1677 


A 


3027 


1 


385 


LTLEFLLLPAASELAHGKRLACCIVDHKLPEC 
GFYGLYDK1LLFKHDPTSANLLQLVRSSGDIQ 
EGDLVEVVLSASATFEDFQIRPHALTVHSYRA 
PAFCDHCGEMLFGLVRQGLKCDGCGLNYHK 

RC 


328 


1678 


A 


3030 


13 


569 


ITRPTI SCQRPGPGL AAGMLPYTVNFKVS ART 

LTGALNAHNKAAVDWGWQGL1AYGCHSLV 

WIDSITAQTLQVLEKHFCADVVKVKWAREN 

YHHNIGSPYCLRLASADVNGKIIVWDVAAGV 

AQCEIQEHAKP1QDVQWLWNQDASRDLLLAI 

HPPNYIVLWNADTGTKLWKKSYADNILSFSF 

D 


329 


1679 


A 


3038 


90 


744 


SVNLPPSLWPWEEAMDSTKSEPLKGSPEAED 

GN1EYKKLVNPSQYRFEHLVTQMKWRLQEG 

RGEAVYQIGVEDNGLLVGLAEEEMRASLKTL 

HRMAEKVGADITVLREREVDYDSDMPRKITE 

VLVRKVPDNQQFLDLRVAVLGNVDSGKSTL 

LGVLTQGELDNGRGRARLNLFRHLHEIQSGR 

TSSISFEILGFNSKGEVHGINGTQWGQTLRMG 

W 


330 


1680 


A 


3040 


3 


397 


LCSTLLLLTIPSWVLSQITLKESGPTLMKPTET 

LTLTCTFSGFSLNTSGVGVAWIRQPPGKALE 

WLAL1YWDDDKKYSPSLNDRLTIAKDTSRNQ 

VVLTMTNMGPVDTATYYCAQFARGARGSN 

WFDPWGQ 


331 


1681 


A 


3043 


3 


1509 


AGIRHE APPTTSNRHRRQIDRGVTHLNI SGLK 

MPRGI AID W V AGNV YWTDSGRD VIE V AQMK 

GENRKTLISGMIDEPHAIWDPLRGTMYW SD 

WGNHPK1ETAAMDGTLRETLVQDNIQWPTG 

LAVDYHNERLYWADAKLSVIGSIRLNGTDPI 

VAADSKRGLSHPFSIDVFEDYIYGVTYINNRV 

FKIHKFGHSPLVNLTGGLSHASDVVLYHQHK 

QPEVTNPCDRKKCEWLCLLSPSGPVCTCPNG 

KRLDNGTCVPVPSPTPPPDAPRPGTCNLQCFN 

GGSCFLNARRQPKCRCQPRYTGDKCELDQC 
WEHCRNGGTCAASPSGMPTCRCPTGFTGPKC 

TQQVCAGYCANNSTCTVNQGNQPQCRCLPG 

FLGDRCQYRQCSGYCENFGTCQMAADGSRQ 

CRCTAYFEGSRCEVNKCSRCLEGACWNKQS 

GDVTCNCTDGRVAPSCLTCV GHCSNGGSCT 

MNSKMMPECQCPPHMTGPRCEEHVFSQQQP 

GHIASDLIP 


332 


1682 


A 


3045 


3 


952 


TTTISNFHTQVNRTYCCGTYRAGPMRQISLVG 

AVDEEVGDYFPEFLDMLEESPFLKMTLPWGT 

LSSLRLQCRSQSDDGPIMWVRPGEQMIPTAD 

MPKSPFKRRRSMNEIKNLQYLPRTSEPREVLF 

EDRTRAHADHVGQGFDWQSTAAVGVLKAV 

OFGEWSDOPRITKDVICFHAEDFTDWORLO 

LDLHEPPVSQCVQWVDEAKLNQMRREGIRY 

ARIQLCDNDIYHPRNVIHQFKTVSAVCSLAW 

HIRLKOYHPVVEATONTESNSNMDCGLTGKR 

ELEVDSQCVRIKTESEEACTEIQLLTTASSSFP 

PASE 


333 


1683 


A 


3046 


497 


167 


SACSTGPELPGRATRSLTRPANQKGCDGDRL 
YYDGCAMIAMNGSVFAQGSQFSLDDVEVLT 
ATLDLED VRS YRAEIS SRNLAVSAPVDTC VG 
CSSKTWKVAPFVRAWWRP 


334 


1684 


A 


3053 


37 


276 


VITDLEEQLNQLTEDNAELNNQNFYLSKQLD 
EASGANDEIVQLRSEVDHLRREITEREMQLTS 
QKQVRRVNKVVRSLEDF 






335 


1685 


A 


3054 






wr>AWfinw < 5nr ,< sRTPorTfrA ^v^r rrpi Tern 

NCEGQNIRYKTCSNHDCPPDAEDFRAQQCSA 

YNDVQYQGHYYEWLPRYNDPAAPCALKCH 

AQGQNL VVELAPKVLDGTRCNTDSLDMCIS G 

1 COAVGCDROL G SNA KEDNCG VCA GDGS TC 

RLVRGQSKSHVSPEKREENV1AVPLGSRSVRI 

TVKGPAHLFIESKTLQGSKGEHSFNSPGVFVV 

ENTTVEFQRGSERQTFKIPGPLMADFIFKTRY 

TAAKDSWQFFFYQPISHQWRQTDFFPCTVT 

CGGG 


336 


1686 


A 


3058 


54 


347 


WGKQEAGAHSDSCCLLHTTPRLTPAHSRKA 
LRNSRIVSQKDDVHVCIMCLRAIMNYQVSRG 
AWDWRLGSPACPHWGLHKLPRLWDPLSLYP 
VLCWGT 


337 


1687 


A 


3059 


2 


709 


ILTSLVELTRFETLTPRFSATVPPCWVEVQQE 

QQQRRHPQHLHQQHHGDAAQHTRTWKLQT 

DSNSWDEFrVFELVLPKACMVGFIVDFKFVLN 

SNTTNIPQIQVTLLKNKAPGLGKVNGLRLCPF 

LEDHKEDILCGPVWLASGLDLSGHAGMLTLT 

SPKLVKGMAGGKYRSFL1HVKAVNERGTEEI 

CNGGMRPWRLPSLKHQSNKGYSLASLLAK 

VAAGKEKSSNVKNENTSGTRK 


338 


1688 


A 


3060 


85 


384 


KAFYNYHVLELLQMLVTGGVSSQLEQHLDK 
DKVYGVADSCTSLLSGRNRCKLGLLSLHETIL 
SDVNPRNTFGQLFCG SLDLFGILC VGLYRIIDE 
EELNP 


339 


1689 


A 


3063 


236 


362 


CFLCLSGDFMVMTIFFNVSRRFGYVAFQNYV 
PSSVTTMLSWV 


340 


1690 


A 


3065 


3 


1249 


DLWQFTPLHEAASKNRVEVCSLLLSYGADPT 

LLNCHNKSA1DLAPTPQLKERLAYEFKGHSLL 

QAAREADVTRJKKHLSLEMVNFKHPQTHETA 

LHCAAASPYPKRKQ1CELLLRKGANINEKTKE 

FLTPLHVASEKAHNDVVEVVVKHEAKVNAL 

DNLGQTSLHRAAYCGHLQTCRLLLSYGCDPN 
USLQGFTALQMGNENVQQLLQEGISLGNSEA 

DRQLLEAAKAGDVETVKKLCTVQSVNCRDIE 

GRQSTPLHFAAGYNRVSWEYLLQHGADVH 

AKDKGGLVPLHNACSYGHYEVAELLVKHGA 

WNVADLWKFTPLHEAAAKGKYEICKLLLQ 

HGADPTKKNRDGNTPLDLVKDGDTDIQDLLR 

GDAALLDAAKKGCLARVKKLSSPDNVNCRD 

TQGRHSTPLHLAGK 


341 


1691 


A 


3070 


1 


547 


GVLIPSFQNQLFADILAG1ESVTSEHNYQTLIA 

NYNYDRDSEEESVINLLSYNIDGIILSEKYHTI 

RTVKFLRSATIPWELMDVQGERLDMEVGFD 

NRQAAFDMVCTMLEKRVRHKJLYLGSKDDT 

RDEQRYQGYCDAMMLHNLSPLRMNPRAISSI 

HLRMQLMRDALSANPDLDGVFCTN 


342 


1692 


A 


3073 


463 


3 


RINRCRKPSDADILVPGDTISLIGTTSLRIDYNE 

IDDNRVTAEEVDILLREGEKLAPVMAKTR1LR 

AYSGVRPLVASDDDPSGRNVSRGIVLLDHAE 

RDGLDGFITITGGKLMTYRLMAEWATDAVC 

RKLGNTRPCTTADLALPGSQEPAKVP 


343 


1693 


A 


3075 


250 


1 


LLI YLA1FAPVAM SALAGVKSVQQVRJRAAQS 

LGASRAQVLWFV1LPGALPEILTGLRJGLGVG 

WSTLVAAELIAATRGLGFM 


344 


1694 


A 


3076 


2 


138 


LYFDAYLQSLQVAAISTFCCLLIGYPLAWAV 
AHSKPSTRN1LLLL 


345 


1695 


A 


3078 


469 


3 


LKIRGQRIELGEIDRVMQALPDVEQAVTHAC 
VINQ AA ATGGD ARQLVGYLV SQSGLPLDTSA 
LQAQLRETLPPHMVPWLLQLPQLPLIANGKL 
DRKALPLPELKAQAPGRAPKAGSETIIAAAFS 
SLLGCDVQDADADFFALGGHSLLAMKLAT 


346 


1696 


A 


3082 


404 


2 


QNITSKDLDVRLDPQTVPIELEQLVLSFNHMI 

ER1EDVFTRQSNFSADIAHEIRTPITNLITQTEI 

ALSQSRSQKELEDVLYSNLEELTRMAKMVSD 

MLFLAQADNNQLIPEKKMLNLAPIEVGKVFD 

QFEALPE 


347 


1697 


A 


3084 


3 


340 


NELTFKEAEISKLYTKVHPAYRTLLEKRQALE 
DEKAKLNGRVTAMPKTQQEIVRLTRDVESGQ 
QVYMQLLNKEQELKTTEASTVGDVR1VDPAIT 
QPGVLKPKKGL1ILGAI 


348 


1698 


A 


3086 


723 


10 


TQAMVWQQKACAEDDPQLSGRHWLHAATL 

YNIAAYPHLKGDDLAEQAQALSNRAYEEAA 

QRLPGTMRQMEFTVPGGAPITGFLHMPKGDG 

PFPTVLMCGGLDAMQTDYYSLYERYFAPRGI 

AMLTIDMPSVGFSSKWKLTQDSSLLHQHVLK 

ALPNWWVDHTRVAAFGFRFG ANV AVRLA Y 

LESPRLKAVACLGPVVHTLLSGLKCQQQVPE 

MYLDVLASRLGMHDASTKSSTRENH 


349 


1699 


A 


3087 


2 


249 


RJ RS SDPEITLAGTPLH AA YL1 GMTLIC AGFSV 
GFGVAMSQALGPFSLRAGVASSTLGIAQVCG 
SSLWIWLAAWGIGAWNM 


350 


1700 


A 


3099 


3 


424 


EAPEATPQPSQPGPSSPISLSAEEENAEGEVSR 

ANTPDSDITEKTEDSSVPETPDNERKASISYFK 

NQRGIQYIDLSSDSEDVVSPNCSNTVQEKTFN 

KDTVIIVSEPSEDEESQGLPTMARKNDDISELE 

DLSGMEDLK 


351 


1701 


A 


3108 


2 


404 


" HCKNH1IGYQLLHRRALFEKRTRLSD YALIFG 
MFGIWMVIETELSWGAYYKAPLYSLALKCL 
1SLFTI1LLGLTIVYHAREIQLFMANYGADDWR 
SALTYEPIFLILLEALRGVIHATPCRVSLSLWD 

GLDLP 


352 


1702 


A 


3110 


341 


2 


AQLAEVCPPQTLLTTNTSSISITAIAAEIKNPER 
VAGLHFFNPAPVMKL VE WSGLATAAE WE 
QLCELTLSWGKQPVRCHSTPGFIVNRVARPY 
YSEAWRALEEQVAAPEVI 


353 


1703 


A 


3111 


3 


188 


HFSLFRIAFAVFLTYMTVGLPLPVIPLFVHHEL 
GYGNTMVGIAVGIQFLATVLTRGYAGRLA 


354 


1704 


A 


3116 


367 


225 


WQLFHLNGTFLNIGETDTESCVNGWVYDRSS 
FPFSNMTEVRGLVFLS 


355 


1705 


A 


3117 


101 


53 


VINLVY1JSSPRPELKPVDKESEVVMKFPDGF 

EKFSPP1LQLDEVDFYYDPKHVIFSRLSVSADL 

ESRICVVGENGAGKSTMLKLLLGDUAPVRGI 

RHAHRNLKIGYFSQHHVGAAGT*TFSACGNL 

LGTQVFLGRPEEEYXRHQLGFGMGISGELGHA 

SSLPACLGGQKEAEVAFCSDGLLPCPNFL\IL\ 

DEPTO\HLGHGRAIEALGPOLQTISGVGVILVS 

HE*SALSRJLVCRE\LWVC*GRSTSPF 


356 


1706 


A 


3121 


137 


466 


RGGRJDWGEHNQRLEEHQARAWQGAMDAG 
AASREHARWQGTGLAPGTRVAVAPTCVQGL 
PQERS VCRPFF SSRWREGPVWALGAG AHGKP 
RWSGGVRCVVRGGRWFTPAPH 


357 


1707 


A 


3124 


1249 


229 


MLEAPGPSDGCELSNPSASRVSCAGQMLEVQ 
PGI YPfifi>4 A AVAFPDT-n RFAniTAVT TV/H^F 

r \JMj I F VJvX/Vrxrt V nU Il_»i\_C/l VJ.I 1 /\ V LI V UjEt 

EPSFKAGPGVEDLWRLFVPALDKPETDLLSH 

LDRC VAFIG QARAEGRAVLVHCHAGVSRSV 

AIITAFLMKTDQLPFEKAYEKLQ1LKPEAKMN 

EGFEWQLKLYQAMGYEVDTSSAIYKQYRLQ 

KVTEKYPELQNLPQELFAVDPTTVSQGLKDE 

VLYKCRKCRRSLFRSSSILDHREGSGPIAFAH 

KRMTPSSMLTTGRQAQCTSYFIEPVQWMESA 

LLG V MDGOLLCPKCS AKLG SFNW YGEOCf?C 

GRWrTPAFQIHKNRVDEMKILPVLGSQTGKI 


358 


1708 


A 


3127 


816 


139 


EVETLGPRTPGP/EAQSPTPGSCPGWQEPSPGP 
TPPP* LSGPGPOG AP VL GKLL PDPEETPAGKTP 
LGKHFWWGL\PVTSANFSPGAAA*FGGALSPP 
GGDL/GHMLLQGPPSPFRLQQQ* QTPPGSHSP 
PTANREINPGPAAAADTRSCWG11KRSWRGW 
RGLAPWRLGFGSPGIP*PAPAGIP/GRPTWEGG 
KGAGGKPSETLTRSPPVWRGKRGSANGFLSW 
VQILQ 


359 


1709 


A 


3132 


3 


191 


HEHLLLLLLCVFLVKSQGVNDNEEGFFSARG 
HRPLDKKREDAPNLRPALADVITVCDYRAQIA 
*AASTPKRAASIAHNAVSCR*AQIA 


360 


1710 


A 


3134 


1 


286 


REPPRPALLFF+DRVSLCCPGWNAWQSQLT 
AAPTSQVQ/SDSPTFPSSWDYRHVPEYPANFL 
* RQGFPMLPRLVSNS WAQT VHPPRPPKVLDL 
QA 


361 


1711 


A 


3135 


56 


1449 


PVPAPRVSPSARGAPGRPRLPGVRGPRHSAVA 

AD*RGSRM/PPRAPAPSPTGP/APGGKKVRGR 

VPEDPDAYEPRCSAL*V*PTHVTSFQFCDP*N 

GQIRSYFTVLLRGLNETMLVK/PLCRREP/PEA 

GPGRQSTPAVTRDHRQHEDPRGAGRQWDAD 

PRPSAP/PAEVATGSRPGRHMWMRLCLAAQQ 

APGLPHRTSIRPGWRRLTEPEAWARRHKRPW 

GQRGAVRPPPQGAAPPPSHQGRRTNTDPSAT 

PRLT VM SRCLAPDLKAPASGPRGWRRGMPQ 

SS/GALLWTPPPTPRGSHSPRPREAPLRAIHPA 

GPSK/SRAGASGRLPEVIYGWVTLFTPPEAGT 

F/LIPSPT*MSPALVIQPPVPPTQMGLR1SGLPR 

QG*PSGAPW*LPGLAQLAFQCIILPHDEVGPP 
RNQSPLGNDTLSSGLPMGPRRQVWPLARVG 
GHSSPREPQVLKKPLWGQTDIAGVGSASLYP 

DNL 


362 


1712 


A 


3136 


1270 


274 


RVGMVLGTREVGDSTPPPSPPLYPFTGNEFVQ 

HNTWQLSRVYPSDLRTDSSNYNPQELWNAG 

CQM/V* GGSRDWEEG VEEQQVGNKFSSDGR 

VGECSRKLLG* EMLS VDITSRYRAPSTYLLNS 

LKEGLEGLHGESCSSFLLGPSVAMNMQTAGL 

EMDICDGHFRQNGGCGYVLKPDFLRDIQSSF 

HPEKPISPFKAQTLLNQV1SVQQLPKVDKTKE 

GSI VDPLVKVQI FGVRLDTARQETNYVENNG 

FNPYWGQTLCFRVLGPDFPMLRFGKMDYDW 

KSRNDLLGKTPCPGTCMQQGYRHIHLLSKDG 

I SLRPASIFV YICIQEGLEGDES 


363 


1713 


C 


3139 


60 


248 


MFAGSYGKSMFSFSKKVLNCLPKWRYHFVIA 
PAMNESPLAPHLHQHLVFSVFQVLTILIG V* * 


364 


1714 


A 


3140 


57 


418 


SAFKTLQLPAFSLYFDLGSLKLLILRIHTSIVK 
NHKVESPRTMSPG*DPQSFLQEPQPRPPQLRV 
GLTSGLIQHFHSPSSCQFPLLRGPPFPRQPPLGI 
SGASLCPVLSPPR*PLQPSSL 


365 


1715 


A 


3145 


122 


413 


LLPYPSLFVFLRQCHFVTARLECNGVVSAHCN 
LHLPGSSDSPASAS*VAGTTGVCHHTRLIF\VF 
LV*TGFH YVAQAGLELLTA* S\PPQLPKVVGL 
QA 


366 


1716 


A 


3150 


247 


2 


VGEKLHDIRFGNDFDMTPKAQATKEKIDKLN 

FIKIKKLCIEGYY/NREPQNGRKIFANYVS\DK 

GLMATIYEELLKLSNKLIQ 


367 


1717 


A 


3152 


3 


2367 



"QKLKQNQPKRAHVEDGGSRSKQGNEQSKKT 
PIEKSDFAAATHPRAFYLSKPDETPNAWMSD 
SGTGLTYWKLEEKDMHHSLPETLEKTFISLSS 
TDVSPNQVLTLDPTLHMKPKQQISGIQPHGLP 
NALDDRISFSPDSVLEPSMSSPSD1DSFSQASN 
VTSQLPGFPKYPSHTKASPVDSWKNQTFQNE 
SRTSSTFPSVYTITSNDISVNTVDEENTVMVAS 
ASVSQSQLPGTANSVPECISLTSLEDPVILSKIR 
QNLKEKJ1ARHIADLRAYYESEINSLKQKLEA 
ICE! SG VEDWK1TNQILVDRCGQLDSALHE ATS 
RVRTLENKNNLLEIEVNDLRERFSAASSASKI 
LQER1EEMRTSSKEKDNTIIRLKSRLQDLEEAF 
ENAYKLSDDKEAQLKQENKMFQDLLGEYES 
LGKEHRRVKDALNTTENKLLDAYTQISDLKR 
MISKLEAQVKQVEHENMLSLRHNSR1HVRPS 
RANTLATSDVSRRKWL1PGAEYSIFTGQPLDT 
QDSNVDNQLEETCSLGHRSPLEKDSSP/GSSST 
SLLIKKQRETSDTPIMRALKELDEGK1FKNWG 
TQTEKEDTSN SLL*/INPRQTETS VNASRSPEK 
CAQQRQKRLNSASQRSSSLPPSNRKSSTPTKR 
E1MLTPVTVAYSPKRSPKENLSPGFSHLLSKN 
ESSPIREKTYSEKATDNHVNHSSCPEPVPNGV 
KJCVSVRTAWEKNKSVSYEQCKPVSVTPQGN 
DFEYTAKIRTLAETERFFDELTKEKDQIEAAL 
SRMPSPGGRITLQTRLNQVKCLSLNLL 


368 


1718 


A 


3163 


2 


2350 


" EFKSGGCGAGLVAAGAVLVLYPASRAGERT 
RVPGSPAPSSLPLHSPGACGTEVDMDPQRSPL 
LEVKGNI ELKRPLIKAPSQLPLSGSRLKRRPDQ 
MEDGLEPEKKRTRGLGATTKITTSHPRVPSLT 
TVPQTQGQTTAQKVSKKTGPRCSTAIATGLK 
NQKPVPAVPVQKSGTSGVPPMAGGKKPSKRP 
AWDLKGQLCDLNAELKRCRERTQTLDQENQ 
QLQDQLRDAQQQVKALGTERTTLEGHLAKV 

QAQAEQGQQELKNLRACVLELEERLSTQEGL 

VQELQKKQVELQEERRGLMSQLEEKERRLQT 

SEAALSSSQAEVASLRQETVAQAALLTEREER 

LHGLEMERRRLHNQLQELKGNIRVFCRVRPV 

LPGEFrPPPGLLLFPSGPGGPSDPPTRLSLSRSD 

ERRGTLSGAPAPPTRHDFSFDRVFPPGSGQDE 

VFEEIAMLVQSALDGYPVaFAYGQTGSGKTF 

QGWTYSFVASYVE1YNETVRDLLATGTRKGQ 
GGECEIRRAGPGSEELTVTNARYVPVSCEKEV 
DALLHLARQNRAVARTAQNERSSRSHSVFQL 

ALGPGERERLRETQAINSSLSTLGLVIMALSN 
KESHVPYRNSKLTYLLQNSLGGSAKMLMFV 

NISPLEENVSESLNSLRFASKVEPSVLFGTAQS 
NRlCWKTDPni CVCVCVCVCVCVCVCVCVP 

MSMYRVRGGRVAGGCFIGWRAPCPRAIK 


369 


1719 


A 


3165 


365 


12 


GYTSQGRWIDIERGPLTANTESLMENNFNALP 

IYRFDAIPVKILTRFFINLDKL1LKPVLKTKIAK 
NRIKTFYIMRRKKLGDSS 


370 


1720 


A 


3170 


393 


42 


GASISPSAVIDGVEGLKPMQEQEAQEAGPCLD 
*T4MAPFOWVAPR\T*T T FUT IFSV1 HAT II A A A A 

QS SAEEDEDPRN* GQS SEDQAPNQNGLIVIVH 
RVHVPLGAAATVPVHRSHFPR 


371 


1721 


A 


3173 


770 


510 


GNGGCGLSQJPPSHLGAFSRGSLLSRG\DPRGP 

PPHPV1FFVFVVE\QGFTVLARMVSIS*PCDPP 

ALASQSAG1TGVSHXARPQNLYF 


372 


1722 


A 


3180 


381 


76 


RVLHHDNVPAHSSPQKREISQEFQLEIRHLP+S 
PDLAPSGCFLFLNLKNIFK\GTHFSLVDNV2CK 
TVSTWLH/SQNAQFYKDRLNGWYHCLQKCL 
QHY*AYVEK 


373 


1723 


A 


3181 


410 


M101 


RREVAGPEGKGLLLASAHTMLTPPLLLLLPLL 

SALVAAA1DAPKTCSPKQFACRDQ1TC1SKGW 

RCDGERDCPDGSDEAPEICPQSKAQRCQPNE 

HNCLGTELCVPMSRLCNGVQDCMDGSDEGP 

HCRELQGNCSRLGCQHHCVPTLDGPTCYCNS 

SFQLQADGKTCKDFDECSVYGTCSQLCTNTD 

GSFICGCVEGYLLQPDNRSCKAKNEPVDRPP 

VLLIANSQNILATYLSGAQVSTITPTSTRQTTA 

MDFSYANETVCWVHVGDSAAQTQLKCARM 

PGLKGFVDEHTINISLSLHHVEQMAIDWLTGN 

FYFVDDIDDR1FVCNRNGDTCVTLLDLELYNP 

KG1ALDPAMGKVFFTDYGQ1PKVERCDMDG 

QNRTKLVDSKIVFPHGITLDLVSRLVYWADA 

YLDYIEWDYEGKGRQTHQGILIEHLYGLTVF 

ENYLYATNSDNANAQQKTSVIRVNRFNSTEY 

QWTRVDKGGALHIYHQRRQPRVRSHACEN 

DOYGKPGGCSDICLLANSHKARTCRrRSGFS 

LGSDGKSCKKPEHELFLVYGKGRPGIIRGMD 

MGAKVPDEHMIPIENLMNPRALDFHAETGFI 

YFADTTSYL1GRQKJDGTERETILKDGIHNVE 

GVAVDWMGDNLYWTDDGPKKTISVARLEK 

AAQTRKTLIEGKMraPRAlVVDPLNGWMYW 

TDWEEDPKDSRRGRLERAWMDGSHRDIFVT 

SKTVLWPNGLSLDIPAGRLYWVDAFYDRIET1 

LLNGTDRKJVYEGPELNHAFGLCHHGNYLFW 

TEYRSGSVYRLERGVGGAPPTVTLLRSFARPPI 
FEIR\MYDAQHQQVGSNKCRVNNAGCSSLCL 

ATPGSRQCACAEDQVLDADGVTCLANPSYVP 

PPQCQPGEFACANSRC1QERWKCDGDNDCLD 

NSDEAPALCHQHTCPSDRFKCENNRC IPNRW 

LCDGDNDCGNSEDESNATCSARTCPPNQFSC 

ASGRCIPISWTCDLDDDCGDRSDESASCAYPT 

CFPLTQFTCNNGRONINWRCDNDNDCGDNS 

DEAGCSHSCSSTQFKCNSGRC1PEHWTCDGD 

NDCGDYSDETHANCTNQATRPPGGCHTDEF 

QCRLDGLCIPLRWRCDGDTDCMDSSDEKSCE 

GVTHVCDPSVKFGCKDSARCISKAWVCDGD 

NDCEDNSDEENCESLACRPPSHPCANNTSVC 

LPPDKLCDGNDDCGDGSDEGELCDQCSLNN 

GGCSHNCSVAPGEGIVCSCPLGMELGPDNHT 

CQIQSYCAKHLKCSQKCDQNKFSVKCSCYEG 

WVLEPDGESCRSLDPFKPFIIFSNRHEIRRIDLH 

KG D Y S VL V PG LRNTI ALDFHL S Q SAL Y W TD V 

VEDKIYRGKLLDNGALTSFEYVIQYGLATPEG 

LAVDWIAGNIYWVESNLDQIEVAKLDGTLRT 

TLLAGDIEHPRAIALDPRDGILFWTDWDASLP 

RIEAASMSGAGRRTVHRETGSGGWPNGLTV 

DYLEKJRJLWIDARSDAIYSARYDG SGHMEVL 

RGHEFL SHPFAVTLYGGEVYWTDWRTKILA 

KANKWTGHNVTVVQRTNTQPFDLQVYHPSR 

QPMAPNPCEANGGQGPCSHLCLINYNRTVSC 

ACPHLMKLHKDNTTCYEFKKFLLYARQMEIR 

GVDLDAPYYNYIISFTVPDIDNVTVLDYDARE 

QRVYWSDVRTQAIKRAFINGTGVETVVSADL 

PNAHGLAVDWVSRNLFWTSYDTOKKQINVA 

RLDGSFKNAWQGLEQPHGLWHPLRGKLY 

WTDGDNISMANMDGSNRTLLFSGQKGPVGL 

AIDFPESKLY WISSGNHTINRCNLDG SGLEVID 

AMRSQLGKATALAIMGDKLWWADQVSEKM 

GTCSKADGSGSWLRNSTTLVMHMKVYDESI 

QLDHKGTNPCSVNNGDCSQLCLPTSETTRSC 

MCTAGYSLRSGQQACEGVGSFLLYSVHEGIR 

GIPLDPNDKSDALVPVSGTSLAVGIDFHAEND 

TIYWVDMGLSTISRAKRDQTWREDVVTNGIG 

RVEGIAVDWIAGNIYWTDQGFDVIEVARLNG 

SFRYWISQGLDKPRAITVHPEKGYLFWTEW 

GQYPRIERSRLDGTERVVLVNVSISWPNGISV 

DYQDGKLYWCDARTDKIERIDLETGENREVV 

LSSNNMDMFSVSVFEDFIYWSDRTHANGSIK 

RGSKDNATDS VPLRTG1GVQLKDIK VFNRDR 

QKGTNVCAVANGGCQQLCLYRGRGQRACA 

CAHGMLAEDGASCREYAGYLLYSERTILKSI 

HLSDERNLNAPVQPFEDPEHMKNVIALAFDY 

RAGTSPGTPNR1FFSDEHFGN1QQINDDGSRJRIT 

IVENVGSVEGLAYHRGWDTLYWTSYTTSTIT 

RHTVDQTRPGAFERETVITMSGDDHPRAFVL 

DECQNLMFWTNWNEQHPSIMRAALSGANVL 

TLIEKDIRTPNGLAIDHRAEKLYFSDATLDKIE 

RCEYDGSHRYVILKSEPVHPFGLAVYGEHIF 

WTDWVRRAVQRANKHVGSNMKLLRVDIPQ 

QPMGIIAVANDTNSCELSPCRINNGGCQDLCL 

LTHQGHVNCS CRGGRILQDDLTCRAVNS SCR 

AQDEFECANGECINFSLTCDGVPHCKDKSDE 

KPSYCNSRRCKKTFRQCSNGRCVSNMLWCN 

GADDCGDGSDEIPCNKTACGVGEFRCRDGTC 

IGNSSRCNQFVDCEDASDEMNCSATDCSSYF 
RLGVKGVLFQPCERTSLCYAPSWVCDGAND 

CGDYSDERDCPGVKRPRCPLNYFACPSGRCIP 

MSWTCDKEDDCEHGEDETHCNKFCSEAQFE 

CQNHRCISKQWLCDGSDDCGDGSDEAAHCE 

GKTCGPSSFSCPGTHVCVPERWLCDGDKDCA 

DGADESIAAGCLYNSTCDDREFMCQNRQCIP 

KHFVCDHDRDCADGSDESPECEYPTCGPSEF 

RCANGRCLSSRQWECDGENDCHDQSDEAPK 

NPHCTSPEHKCNASSQFLCSSGRCVAEALLCN 

GQDDCGDSSDERGCHINECLSRKLSGCSQDC 

EDLKIGFKCRCRPGFRLKDDGRTCADVDECS 

TTFPCSQRCINTHGSYKCLCVEGYAPRGGDP 

HSCKAVTDEEPFLIFANRYYLRKLNLDGSNY 

TLLK Q GLNNA V ALDFD YRE QM1 Y WTD V TTQ 

GSM1RRMHLNGSNVQVLHRTGLSNPDGLAV 

DWVGGNLYWCDKGRDTIEVSKLNGAYRTVL 

VSSGLREPRALVVDVQNGYLYWTDWGDHSL 

1GRIGMDGSSRSV1VDTK1TWPNGLTLDYVTE 

RIYWADAREDYIEFASLDGSNRHVVLSQDIPH 

IFALTLFEDYVYWTDWETKSINRAHKTTGTN 

KTLLISTLHRPMDLHVFHALRQPDVPNHPCK 

VNNGGCSNLCLLSPGGGHKCACPTNFYLGSD 

GRTCVSNCTASQFVCKNDKCIPFWWKCDTE 

DDCGDHSDEPPDCPEFKCRPGQFQCSTGICTN 

PAFICDGDNDCQDNSDEANCDIHVCLPSQFK 

CTNTNRCIPGIFRCNGQDNCGDGEDERDCPE 

VTCAPNQFQCSITKRCIPRVWVCDRDNDCVD 

GSDEPANCTQMTCGVDEFRCKDSGRCIPARW 

KCDGEDDCGDGSDEPKEECDERTCEPYQFRC 

KNNRCVPGRWQCDYDNDCGDNSDEESCTPR 

PCSESEFSCANGRCIAGRWKCDGDHDCADGS 

DEKDCTPRCDMDQFQCKSGHCIPLRWRCDA 

DADCMDGSDEEACGTGVRTCPLDEFQCNNT 

LCKPLAWKCDGEDDCGDNSDENPEECARFV 

CPPNRPFRCKNDRVCLWIGRQCDGTDNCGD 

GTOEEDCEPPTAHTTHCKDKKEFLCRNQRCL 

SSSLRCNMFDDCGDGSDEEDCS1DPKLTSCAT 

NASICGDEARCVRTEKAAYCACRSGFHTVPG 

QPGCQDINECLRFGTCSQLCNNTKGGHLCSC 

ARNFMKTHNTCKAEG SE YQ VL YIADDNEIRS 

LFPGHPHSAYEQAFQGDESVR1DAMDVHVKA 

GRVYWTOWHTGT1SYRSLPPAAPPTTSNRHR 

RQIDRGVTHLN1SGLKMPRGIAIDWVAGNVY 

WTDSGRDVIEVAQMKGENRKTLISGMIDEPH 

AIWDPLRGTMYWSDWGNHPKIETAAMDGT 

LRETLVQDNIQWPTGLAVDYHNERLYWADA 

KLSVIGSIRLNGTDPIVAADSKRGLSHPFSIDV 

FEDYIYGVTYINNRVFKIHKFGHSPLVNLTGG 

LSHASDVVLYHQHK QPE VTNPCDRKJCCEWL 

CLLSPSGPVCTCPNGKRLDNGTCVPVPSPTPP 

PDAPRPGTCNLOCFNGGSCFLNARROPKCRC 

QPRYTGDKCELDQCWEHCRNGGTCAASPSG 

MPTCRCPTGFTGPKCTQQVCAGYCANNSTCT 

VNQGNQPQCRCLPGFLGDRCQYRQCSGYCE 

NFGTCQMAADGSRQCRCTAYFEGSRCEVNK 

CSRCLEGACWNKQSGDVTCNCTDGRVAPS 

CLTCVGHCSNGGSCTMNSKMMPECQCPPHM 

TGPRCEEHVFSQQQPGHIASILIPLLLLLLEVL 

VAGVVFWYKRRVQGAKGFQHQRMTNGAM 

NVEIGNPTYKMYEGGEPDDVGGLLDADFAL 
DPDKPTNFTNPVYATLYMGGHGSRHSLASTD 
EKRELLGRGPEDEIGDPLA 


374 


1724 


A 


3187 


191 


1815 


CLELASAGKIPEESKALSLLAPAPTMTSLMPG 

AGLLPIPTPNPLTTLGVSLSSLGAIPAAALDPNI 

ATLGEIPQPPLMGNVDPSKIDEIRRTVYVGNL 

NSQTTT ADQLLEFFKQ VGEVKF VRMAG DET 

QPTRFAFVEFADQNSVPRALAFNGVMFGDRP 

LKINH SNNAI VKPPEMTPQ AAAKJELEEVMKR 

VRE AQSFISAAIEPGWLHSTSLCNDFLGCF* RR 

RMYRE* APCTICGTFHLCLIIN WDL * LF* A YTA 

K*FFPPRVWKEQ*KKRR\RSRSHTRSKSRSSSK 

SHSRRKRSQSKHRSRSHNRSRSRQKDRRRSK 

SPHKKRSKSRERRKSRSRSHSRDKRKDTREKI 

KEKERVKEKDREKEREREK£REKEK£RGKN 

KDRDKEREKDREKDKEKDREREREKEHEKD 

RDKEKEKEQDKEKEREKDRSKEIDEKRKKDK 

KSRTPPRSYNASRRSRSSSRERRRRRSRSSSRS 

PRTSKTIKRKS SRSPSPRSRNKKDKKREKERD 

HISERRERERSTSMRKSSNDRDGKEKLEKNST 

S 


375 


1725 


A 


3192 


415 


101 


AHSSHQTRAILQEFQWDIIRHPPIASPNLALSG 
FVFPNLKKSLRGTflFS S VKK\TTLTWLNSQDP 
WF/FFYP* SPDLQIPSSFRNGLNDWYHHSQKC 
PDLDGAYVKK 


376 


1726 


A 


3199 


931 


418 


G V* WCDLGSPQPPPPGFKQFCLGRSSS WD YR 
HVPPHPANFVFLLETGFLHAGQAGIAGDPPAS 
ASQSAGITGVSHTWPKNHLIFYACLVIRSKRI 
K 


377 


1727 


A 


3201 


274 


1285 


KTGYTSRGSPLSPQSSIDSELSTSELEDDSISM 

GYKLQDLTDVQIMARLQEESLRQDYASTSAS 

VSRHSSSVSLSSGKKGTCSDQEYDQYSLEDEE 

EFDHLPPPQPRLPRCSPFQRGEPH SQTFSS IREC 

RRSPSSQYFPSNNYQQQQ YY SPQAQTP D QQP 

NRTNGDK/PPKKYA*PSPDAKYNCH* * QHXSSP 

VTVRNSQSFDSSLHGAGNGISR1QSCIPSPGQL 

QHRVHSVGHFPVSIRQPLKATAYVSPTVQGSS 

NMPLSNGLQLYSNTGIPTPNKAAASGIMGRS 

ALPRPSLAINGSNLPRSKIAQPVRSFLQPPKPL 

SSLSTLRDGNWRDGCY 


378 


1728 


A 


3202 


112 


1789 


VPGVTESRPSVLRGDHLFALLSSETHQEDPIT 

YKGFVHKV\ELDRVKLSFSMSLLSRFVGWG* 

PFKVNFY/TFNRQPLRV\QHRALELTGRWLLW 

PMLFPWAPRDVPLLPSDVKLKLYDRSLESNP 

EQLQAMRHIVTGTTRPAPYIIFGPPGTGKTVT 

LVEAIKQVVKHLPKAHILACAPSNSGADLLC 

QRLRVHLPSSIYR1XAPSRDIRMVPEDIKPCCN 

WDAKKGEYVFPAKKKLQEYRVLnTLITAGR 

LVSAQFPIDHFTHIFIDEAGHCMEPESLVAIAG 

LMEVKETGDPGGQLVLAGDPRQLGPVLRSPL 

TQKHGLGYSLLERLLTYNSLYKKGPDGYDPQ 

FITKJLLRNYRSHPTILDIPNQLYYEGELQACA 

DVVDRERFCRWAGILPRQGFPIIFHGVMGKD 

EREGNSPSFFNPEE AATVTS YLKLLL APS SKX 

GKARLSPRSVGVISPYRKQVEKIRYCITKLDR 

ELRGLDDIKDLKVTCCSTVTPCLPCAPTCPLP 

ETSSSFHSSPRPRPTPAALNRARALPEPLTPGD 

SNLRVWDGIRKPACLTNTSCHS 


379 


1729 


A 


3206 


432 


130 


PKAAPSVXL WFPPFL* GSFKPTKGHTXC VX1K 
* LSTREAXDSXPGRQIAXXRQG GKVETTTAL 
XKQSNNKGTRASSYXEPDAXEQWKFPHKKL 
QLPGXTHE 


380 


1730 


A 


3207 


187 


507 


GGTGHPHPARPPLSGVGGCQCSHSKPWTAGS 
PEQRDHPAPHKQIEAGQGLPGPQAWGG*KGP 
AXLLPGPGGGPGPVASLEARAQASSGV1TNG 
GGRTYPYPTFSSGE 


381 


1731 


A 


3225 


1 


840 


GTKPGHT PAPSDGFPV/HT *<?TP < 5wr;<iP* npcT / 

EMQLITSLGLQEFDIARNVLELIYAQTLVWIGI 
FFCPLLPFIQMIMLFIMFYSKNISLMMNFQPPS 
KAWRASOMMTFFTFT 1 FFP^FTHVl PTI A1TI 

WRLKPSADCGPFRGLPLFIHSIYSWIDTLSTRP 
GYLWWWIYRNLIGSVHFFFILTLIVLIITYLY 
WQITEGRKJMIRLLHEQI INEGKDKMFLIEKLI 
KLQDMEKKANPSSLVLKRREVEQQGFLHLGE 
HDGSLDLRSRRSVQEGNPRA 


382 


1732 


A 


3238 


256 


38 


LLMIKVSSTCFSCHLHHHHHHHHRHHQGHNS 

LFFSLKSSSNSSTLPVYLSYNIILVFSKCLVFDF 
LFSNACL 


383 


1733 


A 


3241 


1542 


343 


KGAPSFVRLYQYPNFAGPHAALANKSFFKAD 
KVTMLWNKKATAVLVIASTDVDKTGASYYG 
EQTLHYIATNGESAVVQLPKNGPIYDWWNS 
SSTEFCAVYGFMPAKATIFNLKCDPVFDFGTG 

PP"\TA A VVQPUfllJTT \/l A/^ITfXTT TT /"\T* A T\ fw\ KIT 

risiNAA i i orrlOrllL v JL f AUrLiNL.lL t Ql* AD/IMK. 

VWNVKNYKLISKPVASDSTYFAWCPDGEHIL 

TATCAPRLRVNNGYKIWHYTGSILHKYDVPS 

NAELWQ VS WQPFLDG IFP AKTITYQA VPSEVP 

NEEPKVATAYRPPALRNKPITNSKLHEEEPPQ 

NMKPQSGNDKPLSKTALKNQRKHEAKKAAK 

QEARSDKSPDLAPTPAPQSTPRNTVSQSISGDP 

EIDKKIKNLKKKLKAIEQLKEQAATGKQLEK 

NOT FKIOKFTAT T OFT Pni FT 


384 


1734 


A 


3242 


3 


678 


mSPAARSPGLETPTCLLFVIAAIAAVFVDSAIP 

RLTQHRPQDGSFPYTILDPPLYLPGQCAPPQP 

LSQCARRVHGEKLRRPTFGPRHRGAGTAKMS 

ASLVRATVRAVSKRKLQPTRAALTLTPSAVN 

KIKQLLKDKPEHVGVKVGVRTRGCNGLSYTL 

EYTKTKGDSDEEVIQDGVRVFIEKKAQLTLL 

GTEMDYVEDKLSSEFVFNNPNIKGTCGCGES 
FNI 


385 


1735 


A 


3243 


3190 


664 


VAMGTPRAQHPPPPQLLFLILLSCPWIQGLPL 

KEEEILPEPGSETPTVASEALAELLHGALLRR 

GPEMGYLPGPPLGPEGGF^ETTTTnTTTTVTT 

TVTSPVLCNNNISEGEGYVESPDLGSPVSRTL 

GLLDCTYSIH V YPGY GIEIQVQTLNLSQEEELL 

VLAGGG SPGL APRLL ANSSMLGEGQ VLRSPT 

NRLLLHFQSPRVPRGGGFRIHYQAYLLSCGFP 

PRPAHGDVSVTDLHPGGTATFHCDSGYQLQG 

EETLICLNGTRPSWNGETPSCMASCGGTIHNA 

TLGRJVSPEPGGAVGPNLTCRWVIEAAEGRRL 

HLHFERVSLDEDNDRLMVRSGG SPLSPVIYDS 

DMDDVPERGLISDAQSLYVELLSETPANPLLL 

SLRFEAFEEDRCFAPFLAHGNVTTTDPEYRPG 

ALATFSCLPGYALEPPGPPNAIECVDPTEPHW 

NDTEPACKAMCGGELSEPAGWLSPDWPQS 

YSPGQDCVWGVHVQEEKRILLQVEILNVREG 

DMLTLFDGDGPSARVLAQLRGPQPRRRLLSS 

GPDLTLQFQAPPGPPNPGLGQGFVLHFKEVPR 

NDTCPELPPPEWGWRTASHGDLIRGTVLTYQ 

CEPGYELLGSDILTCQWDLSWSAAPPACQKI
MTCADPGEIANGHRTASDAGFPVGSHVQYRC

LPGYSLEGAAMLTCYSRDTGTPKWSDRVPKC 

ALKYEPCLNPGVPEN GYQTLYKHH YQAGE SL 

RFFCYEGFELIGEVTITCVPGHPSQWTSQPPLC 

KVTQTTDPSRQLEGGNLALAILLPLGLVIVLG 

SGVYIYYTKLQGKSLFGFSGSHSYSPITVESDF 

SNPLYEAGDTREYEVSI 


386 


1736 


A 


3250 


5725 


3984 


GTSTVTMATKXHFSIILNLLGMLLKKDNQDT 

RKLLMTWALEVAVVMKKSETYAPLFCLPSF 

HKFCKGLLADTLVEDVNICLQACSSLHALSSS 

LPDDLLQRCVDVCRVQLVHRGTCJRQAFGKL 

LKSIPLG VFLSNNNHTEIQEI SLALRSHMSKAP 

SNTFHPQDFSD/VISFIL YGN SHRTGKDNWLE 

RLFYSCQRLDKRDQSTIPRNLLKTDAVLWQW 

AIWEAAQFTVLSKLRTPLGRAQDTFQTIEGIIR 

SL AGHTLN PDQDV SQ WTTADNDEGHGNNQL 

RLVLLLQYLENLEKLMYNAYEGCANALTSPP 

KV1RTFLYTNRQTCQD WLTRIRLSIMRVGLLA 

GQPAVTVRHGFDLLTEMKTTSLSQGNELEVSI 

MMVVEALCELHCPEAIQGIAVWSSSIVGKHL 

LWINSVAQQAEGRJFEKASVEYQEHLCAMTG 

VDCCISSFDKSVLTLASAGCKSASLKHCLNGE 

SRKSVLSKPTDSSPEVINYLGNKACECYISTA 

DWAAVQEWQNAIHDLKKSTSSTSLNLKADF 

NYIKSLSSFESGKFVECTEQLELLPGENINLLA 

GGSKEKIDMKKLLRNM 


387 


1737 


A 


3255 


380 


76 


MDIFLYNCKYQVQTEI T NSIQHIMA\SKKLSRF 
LKYVHNL*AENYKTLMK*INEDLNKQRDVPY 
S*TARLNKMSIP'nCTIFRFKAIYIKIPATYFIET 

NMQ 


388 


1738 


A 


3260 


685 


428 


PQWLGLQVYALPPANFVFFVEMRSTILAQTG 
FELLDSSDLPASASKSAGITCMSHHARTLSLK 
*WPFCLSATQEKFC*PASEGVAW 


389 


1739 


A 


3269 


1 


332 


LDG YHTPI YMLNRIIRLPAAL* IISDQTGHALTI 
LTRLETQMINADYQNKLTLDYLLTTDREVYE 
PFNLTNYCLHIHNQRLGAYDLG*V*Q/KLAHV 
PVQ V* HGFDPE AMFR 


390 


1740 


A 


3270 


2 


372 


GRCHDQNKGKS\DGPDAQAEACGGESTYQEL 
LVNQNPIGQPLACRRLTRKIYEGIKJCAVKPNH 
SPRGVKKVHKFVNKGEKGIMVLAGDTLGIGV 
YCLLPCMC*DRKLTYAHIPSTTDLGAGAGY 


391 


1741 


A 


3273 


1 


187 


FFQEMLDIMKAISDMMGKCTYPVLKEDAPRQ 
HVETFFQVEELTRS QEGMKLGENFLMF AMPP 
DDSKESKGK* FFQEMLDIMKAISDMMGKCTY 
PVLKEDAPRQHVETFFQVGINQKSRGHEVRR 
KFPDVCHAPR 


392 


1742 


A 


3281 


901 


521 


FFFGDGVSPCRQAGV*WHDLDSLQNLPPGFK 
RFSYLSLPSSW\DYRHVLPRQANFCIF/M*RRG 
FTMLARMVSIS*PRDLPALASQSAGITGVSHH 
APPQMDFTFALLCFALKGCLPRQKEGGTLNLI 


393 


1743 


A 


3283 


385 


3 


RNRSVVPEFVLLGLSAGPQTQTLLFVLFWIC 
LLTVMGNLLLLVVINADSCLHTPMYFFLGQL 
SFLDLCHSSVTAPKLLENLLSEKKTISVEGCM 
A* VFFVFATG GTE SSLLAVMA YDRYVAIRTR 
G 


394 


1744 


A 


3284 


575 


1054 


CTKCKADCDTCFNKNFCTKCKSGFYLHLGKC 
LDNCPEGLEANNHTMECVSIVHCEVSEWNP 
WSPCTKKGKTCGFKRGTETRVREIIQHPSAKG 
NLCPPTNETRKCTVQRKKCQKGERGKKGRE 




WO 01/57188 



PCT/US01/03800



SEQ ID NO: of nucleotide sequence

SEQ ID NO: of peptide sequence

Method

SEQ ID NO: in USSN 09/496914

Predicted beginning nucleotide location corresponding to first amino acid residue of peptide sequence

Predicted end nucleotide location corresponding to last amino acid residue of peptide sequence

Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid, E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine, I=Isoleucine, K=Lysine, L=Leucine, M=Methionine, N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine, T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop codon, /=possible nucleotide deletion, \=possible nucleotide insertion)


RKRKKPNKGESKE AIPDSKSLE SSKEIPEQREN
KQQQ 


395 


1745 


A 


3286 


1 


340 


RVLYVPSMGFCILVAHGWQKJSTKSVFKKLS 
WICLSMVILTHSLKTFHRNWDWESEYTLFMS 
ALKVNKNNAKLWNNVGHALENEKNFERAL 
KYFLQATHVQPDDIGAHMNVGR 


396 


1746 


A 


3293 


1 


172 


GFRAVVMTVKTEAAKGTLTYSRMRGMVAIL 
IAFMKQRRMGLNDFIQKIANN SYACKQ 


397 


1747 


A 


3295 


12 


401 


AEPACGASSCTPPSLRSSSSQSVGPLRPGRPL 

WSEACAFL*AAAPQGPASPCCGLPSGFPRVW 

AQCCPPGGALRFPEGLGSVLSPRRCPQVSRGS 

GLSAVPQEVPSGFLGPGLRACPQEAPSRFLRA 
GLT 


398 


1748 


A 


3300 


1912 


2768 


KQRRWQNIQRKGPKRYIVTAGNSQSHQPMIFS 

MLRKLPKVTCRDVLPEIRAICIEEIGCWMQSY 

STSFLTDSYLKYIGWTLHDKHREVRVKCVKA 

LKGLYGNRDLTARLELFTGRFKDWMVSMIV 

DREYSVAVEAVRLLILILKNMEGVLMDVDCE 

SVYPIV*ASN*GLASAVGEFLYWKLFYPECEI 

RTMGGREQRQSPGAQRTFFQLLLSFFVESKSH 

SVTQAGVQWQFSAHRDLCLPGSSNSHVSASR 

VAGIAGAHRHTWLIYVFFSWRQGFAVLAGL 
VSNS 


399 


1749 


A 


3301 


536 


2391 


LRSYGCKAPSRISHLHK\FLFLLLPSLLMGYSE 

SPPPITDSWAPFISLTHHVLSQSQSPLSSNCWI 

CLSTHTQ* FTALP ADLLT WTQSN VSLHI S YLAI 

PFLADSFLKPV/L*PGNSAKHLSFKLSSLSMVS 

GRAVALLHLIASGLTSIQTNTASSKPPIWGY\L 

STQTSFISPPPLCLSRTYPNPAHATMVGQVPQ 

SLCGLIFTL/RTPCRPSILHPNYKJISTSAWQKV 

LCFSGSPTIHTSLHLTTGSSFLSFHPIPGFPAAN 

SALYVSSLKGPPGKNVTIPSPVTGT*QPPHRGS 

N/RLTVDKJDNFFLSPKPNSLHQLPSQVTPYQAL 

TGAALAGSYPIWENENTLSWLPTFTYNFCLST 

PSLFFLCDTN* YLCLPANWSGTCTLVFQ APTI 

NILPPNQTILISVEASISSSPIRNKWALHLITLLT 

GLGITAALGTGIAGITTSITSYQTLFTTLSNTVE 

DMHTSITSLQRQLDFLVGVILQNWRVLDLLT 

TEKGGTCIYLQEECCFCVNESGIVHIAVRRLH 

DRAAEL*HQVADSWWQGSSLLRWIPWVAPF 

LGPLIFLFLLLMIGPCIFNLVSRFISQRLNCFIQ 

ASMQKHIDNIFHLCHV*YQSLRGNHSEAPEPR 
P 


400 


1750 


A 


3303 


2 


453 


THWRHSSGVPGSTTARRRRRELEIATSDNQE 

YYNRLCQEVTNRERNDQKMLADLDDLNRTK 

KYLEERLIELLRDKDALWQKSDALEFQQKLS 

AEERWLGDTEANHCLDCKREFSWMVRRHHC 

RICGRIFCYYCCNNYVLSKHGGKKERCC 


401 


1751 


A 


3304 


1 


626 


MAPQHSSLDDKVPQQASTVCFEFQDILQHSQ 

CTEHKDSLWGPGARSQPFGAHNTRLSPDSCP 

EKIVLRALKDSRAGMPEQDKDPGVQENPDD 

QRRVPQGTGDAPSAFRPLWDNGGLSPFVSRP 

GPLERDLHAQRSEVTYNQRSQSSWMSSFPKR 

NAFVSPYSSMGQAQP/GLPKTNPIGESCCWEG 

LSLSTQILG*QKPSKYIPSLCKR 


402 


1752 


A 


3305 


1678 


172 


MELPSGPGPERLFDSHRLPGDCFLLLVLLLYA 
PVGFCLLVLRLFLGIHVFLVSCALPDSVLRRF 
VVRTMCAVLGLVARQEDSGLRDHSVRVLISN 
HVTPFDHNIVNLLTTCSTVSE SEAESATGRFP
GAQLKAPLSPLAFRMEDTEALPLTP1LYPTCQ

FFFFMFLNIFLLAFSSPGSQPLLNSPPSFVCWSR 

GFMEMNGRGELVESLKRFCASTRLPPTPLLLF 

PEEEATNGREGLLRFSSWPFSIQDVVQPLTLQ 

VQRTLVSVTVSDASWVSELLWSLFVPFTVY 

QVRWLRPVHRQLGEANEEFALRVQQ\LVAJCE 

LG\QTGTRLTPA\DKAEHMKRQRHPR\LRPQS 

AQSSFPPSPWVLSS/SDVQTGQTLGFREFKESF 

CPHVAIGVFIPERPWPKTGCCKTLTIHL1LL+G 

GPVSFSCPE\DIHPRGT*VPTQQASGLPSFPSYG 

PARGGVL*HPSAQQPLTFA\KSS\WARAGRAL 

QERKQXALYEYARRRFTERRAPGGLD 


403 


1753 


A 


3307 


44 


447 


DPSPSLLAVALGLRAGERTRSGPGSSSPSGGIS 

GGASAGLASSPECACGRSHFTCAVSALGECT 

CIPAQWQCDGDNDCGDHSDEDGCILPTCSPL 

DFHCDNGKCIRRSWVCDSDNDCEDDSDEQD 

CPPRECEED 


404 


1754 


A 


3311 


409 


1 


PRHGWGRRVLGRDRPRLQKVKKSVKAIYIPG 

QDHVQNEEIYARVLDKFGSNFLSRDNADLGT 

AFVKFSTLTK*LSALLKNLLQGLSRNVIFTLDS 

LLKGDLKGVKGDLKKPFDKAWKDYETKFAK 

IEKEKREREWR 


405 


1755 


A 


3322 


12 


458 


AAVPVENPWDDPRVRPRVRIFTWEDCIAGQA 
KVLCM)SYGVTIDWSPKGAFIRLTSQSVGNG 
HPASKENDQMVDTIKKTTKVPII WT YGDMV E 
PRPQMIRPAVGAKHKELWKILMALKKIKMWE 
GKYTKPSQYNPNYMLELAHNDSVW 


406


1756 


A 


3324 


1 


426 


LSMLSTISTEHRLSVLWPIWYCCHCPTHLSAV 

MCVLLWALSLLQSILEWMFCSFLFSDVDSDN 

WCQILDFLTAVWLIFLrJvVLCGFTLVLLVRIIC 

GSQKMPLTRLYVT1LLTGLVFLFCSLPLS1Q+F 

LLYWIEKDLDDL 


407 


1757 


A 


3328 


213 


1841 


SGDLSPAELMMLTIGDVIKQLIEAHEQGKDID 

LNKVKTKTAAKYGLSAQPRLVDI1AAVPPQY 

RKVLMPK1JKAKPIRTASG1AVVAVMCKPHRC 

PHI SFTGNIC VYCPGGPDSDFE Y STQS YTG YEP 

TSMRAIRARYDPFLQTRHRIEQLKQLGHSVD 

KVEFIVMGGTFMALPEEYRDYFIRNLHDALS 

GHTSNNIYEAVKYSERSLTECCIGITIETRPDYC 

MKRHLSDMLTYGCTRLEIGVQSVYEDVARD 

TNRGHTVKAVCESFHLAKDSGFKWAHMMP 

DLPN VGLERDIEQFTEFFENPAFRPDGLKL YP 

TLV1RGTGLYELWKSGRYKSYSPSDLVELVA 

RILALVPPWTRVYRVQRDIPMPLVSSGVEHG 

NLRELALARMKDLGIQCRDVRTREVGIQEIH 

HKVRPYQVELVRRDYVANGGWETFLSYEDP 

DQDILIGLLRLRKCSEETFRFELGGGVSIVREL 

HVYGSVVPVSSRDPTKFQHQGFGMLLMEEA 

ERIAREEHGSGKIAVISGVGTRNYYRKIGYRL 

QGPYMVKMLK 


408 


1758 


A 


3335 


3 


467 


A1ASPRAAGIRHELTSTMAAGKNKRLTK.GGK 
KGAKKKAV/DNIINIGKTLVTRTQRTKIASDG 
LKGRVFEESLADLQNDVTDGYLLRVI* VAFTT 
ERTNQI/REVFNKLIPDSIGKDIEKACQSIYPLH 
DDFARKVKMLKKPKFELRKLMELHGEGSS 


409 


1759 


A 


3338 


7 


1252 


PRWRNSARDEILLSFPQNYYIQWLNGSLIHGL 
WNLASLFSNLCLFVLMPFAFFFLESEGFAGLK 
KGIRARILETLGMLLLLALL1LGIVWVASALID 
NDAASMESLYDLWEFYLPYLYSC1SLMGCLL
LLLCTPVGL\SRMFTVMGQLLVKPTILEDLDE

QIYIITLEEEAI QRPTTCWAVFIR W/KYNIMELE 

QELENVKTLKTKLERRKKASAWERNLVYPA 

VMVLLLIETSISVLLVAC^JILCLLVDETAMPK 

GTRGPG1GNASLSTFGFVGAALEIILIFYLMVS 

SVYGFYSLRFFGNfTPKKjDDTTMTKIIGNCVS 

ILVLSSALPVMSRTLGITRFDLLGDFGRFNWL 

GNFYIVLSYNLLFAIVT7LCLVRKFTSAVREE 

LFKALGLHKLHLPNTSKDSETAKPSVNGHQK 

AL 


410 


1760 


A 


3339 


127 


1433 


GSHRFSLASPLDPEVGPYCDTPTMRTLFNLL 

WIA1ACSPVHTTLSKSDAKKAASKTLLEKSQ 

FSDKPVQDRGLVVTDLKAESWLEHRSYCSA 

KARDRHFAGDVLGYVTPWNSHGYDVTKVFG 

SKFTQISPVWLQLKRRGREMFEVTGLHDVDQ 

GWMRAVRKHAKGL\P*CLGSCLRTGLTMISG/ 

Y VLDSED EIEELSKTVVQ VAKNQHFDGF V VE 

VWNQLLSQKRVGLIHMLTHLAEALHQARLL 

ALLVIPPAITPGTDQLGMFTHKEFEQLAPVLD 

GFSLMTYDYSTAHQPGPNAPLSWVRACVQV 

LDPKSKWRSKILLGLNFYGMDYATSKDAREP 

VVGARYIQTLKDHRPRMVWDSQVSEHFFEY 

KKSRSGRHWFYPTLKSLQVRLELARELGVG 

VSIWELGQGLDYFYDLL*VGIAASAVDVFFSK 

PWSE 


411 


1761 


A 


3342 


74 


2701 


VATRKLAKGFTQFAKMTEGTKKTSKKFKFFK 
FKGFGSFSNLPRSFTLRRSSASISRQSHLEPDTF 

eatqddmvtvpksppayarssdmyshmg™ 

prpsikkaqnsqaarqaqeagpkpnlvpggv 

pdppgleaakevmvkatgpledtpamepnps 

avevdpirkpevptgdveeerpprdvhseraa 

gepeagsdyvkfskeky1ldsspeklhkelee 

elklsstdlrshawyhgriprevsetlvqrn 

gdfliiu)sltslgdyvltcrwknqalhfk3n 

kwvkagesythiqylfeqesfdhvpalvry 

hvgsrkavseqsga1iycpvnrtfplryleas 

yglgqg sskpaspvspsgpkgshmkrrs vtm 

tdgltadkvtrsdgcptstslprprdsirsca 

lsmdqipdlhspmspisespsspaystvtrvha 

apaapsatalpaspvarrssepqlcpgsapkt 

hgesdkgphtspshtlgkaspspslssysdpds 

ghycqlqppvrgsrewaatetssqqarsyge 

rlkelsengapegdwgki ¥\ ' vpi vevtssfnp 

atfqslliprdnrplevgllrkvkellaevda 

rtlarhvtkvdclvarjlgvtkemqtlmgv 

rwgmelltlphg\rklrldllerfhtmsiml 

avdilg ctgsaeeraallhktiql aaelrgt 

mgnmfsfaavmgaldmaqisrleqtwvtlr 

qrhtegailyekklkpflkslnegkegpplsn 

ttfphvlphtllecdsappegpepwgstehgv 

EWLAHLEAARTVAHHGGLYHTNAEVKLQG 
FQARPELLEVFSTEFQMRLLWGSQGASSSQA 
RRYEKFDKVLTALSHKLEPA VRS SEL 


412 


1762 


A 


3347 


1 


898 


IDRAAECRTKPLPMAVSIRGNADSIVACLVLM 

VLYLIKKRLVACAAVFYGFAVHMKIYPETYI 

LPITLHLLPDRDNDKSLRQFRYTFQACL*ELL 

KRLCNRTALMFVAVAGLTFFALSFGFYYEYG 

WEFLEHTYFYHLTRRDIRHNFSPYFYMLYLT 

AESKWSF SLG1AAFLPQLILLSAV SFAYYRDL 

VFCWFLHTSIFVTFNKVCTSQYFLWYLCLLPL
V MPL VRMP WKRA VV LLML WFI G QAM W LAP
AYVLEFQGKNTFLFIWLAGLFFLLINCSILIQI1 
SHYKEEPLTERIKYD 


413 


1763 


A 


3361 


3 


474 


PIPVRWNSLEGRLLRGYEQHANDGKDYISRN 

♦DLRSWTAADMAAQITKRKWEAEEFAEQIKA 

YLEGTCVER/LRTHLENGKETLQLTEQSSQPTI 

PIVGIVAGLVLLGAVVTGAWSAVMCRKKNS 

GHFLPTDRVSYSEAASSDHAQGSDVSLTACK 

V 


414 


1764 


A 


3363 


14SS 


453 


HQILELKKKILKTYNPDYDEDLVQEASSEDVL 

GVHMVDKDTERDIEMKRQLRRLRELHLYST 

WKKYQEAMKTSLGVPQRERDEGSLGKPLCP 

PEILSETLPGSVKKRVCFPSEDHLEEFIAEHLP 

EA SNQ SLLT V AHAD AGTQTNGDLEDLEEHGP 

GQTVSEEATEVHMMEGDPDTLAELLIRDVLQ 

ELS SYNGEEE\DPEEVKTSLGVPQRGDLEDLE 

EHVPGQTVSEEATGVHMMQVDPATLAKSDL 

EDLEEHVPEQTVSEEATGVHMMQVDPATLA 

KQLEDSTITGSHQQMSASPSSAPAEEATEKTK 

VEEEVKTRKPKKKTRKPSKKSRWNVLKCWD 

IFNIF 


415 


1765 


A 


3369 


431 


315 


IPWSWVGRLSVRKMSILF*LTYNYNA1LNKTP 
PSFSPSL 


416 


1766 


A 


3373 


42 


651 


RQEKMGLGEIGASGVLRSMLKERKKQNMKG 

NGNVTLTPLLPAVQCGCHLQPAGRSPLPSSHS 

APGLCSPLHPLQPQQEASTCPSGTLQGREKAA 

PGQGRPLCSLWAGGAGA\PGERGAEGRGPSD 

QAPDPKSGPWLFPPGLGAPAEVRLHNVPHNL 

RRPPLP*ARGK*PPNSGCPWSEGRAKQPLSCG 

PKPQCSLPSQVPGDTH 


417 


1767 


A 


3382 


2 


2061 


EAQDPRACGPDAGGRFAARDAPGN SLRPPP S 

SPP/GWPGQLRLLPRVPGSELRCGKPERGRLP 

ASPPGKIRGWPPGISKRPGLGGRSFPPGFAPRT 

WRPEARGPSVQSLPPIFSPQSAQTTAR* RPGAP 

KNAGRCGGA\RGPRLSLGPPPGPPPAPALPAR 

AS AG AG AAAAAL A VGG VRGA GG ARGTG GY 

GHCSGR/PTGRTGPGPQGPGPPMPARPR*AS\S 

TRGSRRGPGSRPARAAAAPRAGDHGRRPVRV 

HLRQHTAV*EPRLGDATAPPGGAAGPGAPAP 

R\GPGWDCALLPSPGPRSPRAVGCAEPEIWDP 

SPRRGTSPVPSVRSLRSEPANPRLGLPALLNSY 

PLKGPGLPPPWGPRTQTGHVUTVQPSGSCIEH 

SKSLD/RGPWGAPPWGPSSSGLCSPKLATAGP 

PQSWGLCQIGRRRGLGGPGLKRGET/GLL*GC 

SMDHANRTKGPGVPTSNRCFSHIPG\GDGCSD 

HSSCEGHPDLHAGREMPAAPGLSELERVRFT 

VGCGGLASGISSASVSGLSPNRAGGPGQGDW 

EMYPVSWQTQESGGQG/SPKTGR*VGMLQA 

GAGSLQGGTGDGVWGLWEDGP/RG*DSPLPS 

GTGTEP*'l'FriSIPFFPQPSGVYPSRATLLPMPS 

Y * ALGPS ANKSEKPLL SFLYRGLCCRJ SLQL A 

KGIGQLSEIPLLNVETAFWSMWVTYFRK 


418 


1768 


A 


3398 


304 


2121 


EEEEEEEDEDDDDNNEEEEFECYPPGMKVQV 

RYGRGKNQKMYEASIKDSDVEGGEVLYLVH 

YCGWNVRYDEWIKADKJVRPADKNVPKIKH 

RKKIKNKJLDKEKDKDEKYSPKNCKPPALGPN 

PPFQTNPISWKWYPKLDLTDAKNSDTAHIKSI 

EITSILNGLQASESSAEDSEQEDERGAQDMDN 

NGKEESKIDHLTNNRNDLISKEEQNSSSLLEE 



NKVHADLVISKPVSKSPERLRKDIEVLSEDTD

YEEDEVTKKRKDVKKDTTDKSSKPQIKRGKR 

RYCNTEECLKTGSPGKKEEKAKNKESLCMEN 

SSNSSSDEDEEETKAKMTPTKKYNGLEEKRK 

SLRTTGFYSGFSEVAEKRJKLLNNSDERLQNS 

RAKDRKDVWSSIQGQWPKKTLKELFSDSDTE 

AAASPPHPAPEEGVAEESLQTVAEEESCSPSV 

ELEKPPPVNVDSKPIEEKTVEVNDRKAEFPSS 

GSNFSAMPLPYLHLNRLHQSL* QKGSRQQSS 

VTVSEPLAPNQEEVRSIKSETDSnEVDSVAGE 

LQDLQSERE* LASRF* CQCELKQ* * SARTRTS * 

KSLYRSEKSERCSGRRKFIKKAEKKP*SNSGK 

QQKEGK 


419 


1769 


A 


3399 


206 


463 


QRECLSIHIGQAGIQIGDACWELYCLEHGIQP 
NGVVLDTQQDQLENAKMEHTNASFDTFFCE 
TRAGKHVPRALFVDLEPTVIDGIR 


420 


1770 


A 


3408 


1010 


685 


RRLSFFF*IWSSVLVTQARVQWRDLGSPQPLP 
PGFKRFSCLSLPSSWDYRHPSPRPVNF/HVFLV 
VMGFHHVGQAGLELLTSGDLPALASQSAR1T 
GVNHCAQPRGHFH 


421 


1771 


A 


3409 


355 


1326 


ADSNLrESCWQELGLGPWGGDWRVEQVGAS 
ASLRFPREVCSIRFLFTAVSLLSLFLSAFWLGL 
LYLVSPLENEPKEMLTLSEYHERVRSOGOOI 

QQLQAELDKLHKEVSTVRAANSERVAKLVF 

QRLNEDFVRKPDYALSSVGASIDLQKTSHDY 

ADRNTAYFWNRFSFWNYARPPTVILEPHVFP 

GNCWAFEGDQGQVVIQLPGRVQLSDITLQHP 

PPSVEHTGGANSAPRDFAVFFLLSFFTHQGLQ 

VYDETEVSLGKFTFDVEKSEIQTFHLQNDPPA 

AFPKVKIQILSNWGHPRFTCLYRVRAHGVRT 

SEGAEGSAQGPH 


422 


1772 


A 


3412 


2 


421 


EFDAQPSIGALVVFKRP*ATTGSDPGPKRGMN 

YLVSCSMRSPESGKGEPGTARDYTPMGRPPP 

PVPSVSPGPLPGSLAIAPHSPEPHPWEQQPPRG 

QARSPPGGWLGSAT/RVRRPHNHP/RGH/HSP 

VDTAGAPASPGPDVCE 


423 


1773 


A 


3420 


91 


706 


DAQRAIYSSVGPAVSLRQRQQDGAVKESGR/ 
RGGVRSFSRAAAAMAPIKVGDAIPAVEVFEG 
EPGNKVNLAELFKGKKGVLFGVPGAFTPGCS 
KTHLPGFVEQAEALKAKGVQVVACLSVNDA 
FVTGEWGRAHKAEGKVRLLADPTGAFGKET 
DLLLDDSLVSIFGNRRLKRFSMVVQDGIVKA 
LNVEPDGTGLTCSLAPNHSQL 


424 


1774 


A 


3421 


4 


7688 


RQVTRVGTRVLGSTTAAVFLSVEDDNDNAPQ 

FSEKRYVVQVREDVTPGAPVLRVTASDRDKG 

SNAWHYSIMSGNARGQFYLDAQTGALDVV 

SPLDYETTKEYTLRVRAQDGGRPPLSNVSGL 

VTVQVLDINDNAPIFVSTPFQATVLESVPLGY 

LVLHVQAIDADAGDNARLEYRLAGVGHDFP 

FTINNGTGWISVAAELDREEVDFYSFGVEAR 

DHGTPALTASASVSVTALDVNDNNPTFTQPE 

YTVRLNEDAAVGTSVVTVSAVDRDAHSVITY 

QITSG^rTRNRFSITSQSGGGLVSLALPLDYKLE 

RQYVLAVTASDGTRQDTAQIWNVTDANTH 

RPVFQSSHYTVNVNEDRPAGTTVVUSATDE 

DTGENARITYFMEDSIPQFRIDADTGAVTTQA 

ELDYEDQVSYTLAITARDNGIPQKSDTTYLEI 

LVNDVNDNAPQFLRDSYQGSVYEDVPPFTSV 

LQISATORDSGLNGRVFYTFQGGDDGDGDFI

VESTSGIVRTLRRLDRENVAQYVLRAYAVDK 

GMPPARTPMEVTVTVLDVNDNPPVFEQDEFD 

VFVEENSPIGLAVARVTATDPDEGTNAQIMY 

Q1VEGNIPEVFQLDIFSGELTALVDLDYEDRPE 

Y VL VIQATSAPL VSRATVHVRI XDRNDNPPV 

LGNFEILFNNYVTNRSSSFPGGAIGRVPAHDP 

DI SDSLTYSFERGNELSL VLLNASTG ELKLSR 

ALDNNRPLEA1MSVLVSDGVHSVTAQCALRV 

TIITDEMLTHSITLRLEDMSPERFLSPLLGLFIQ 

AVAATLATPPDHVVVFNVQRDTDAPGGHILN 

VSLSVGQPPGPGGGPPFLPSEDLQERLYLNRS 

LLTAISAQRVLPFDDNICLREPCENYMRCVSV 

LRFDSSAPFIASSSVLFRPIHPVGGLRCRCPPGF 

TGD YCETE VDLCY SRPCGPHGRCRSREGGYT 

CLCRDGYTGEHCEVSARSGRCTPGVCKNGGT 

CVNLLVGGFKCDCPSGDFEKPYCQVTTRSFP 

AHSFITFRGLRQRFHFTLALSFATKERDGLLL 

YNGRFNEKHDFVALEVIQEQVQLTFSAGEST 

TTVSPFVPGGVSDGQWHTVQLKYYNKPLLG 

QTGLPQGPSEQKVAVVTVDGCDTGVALRFGS 

VLGNYSCAA\QGTQGGSKKSLDLTGPLLLGG 

VPDLPESFPVRMRQFVGCMRNLQVDSRHIDM 

ADFIANNGTVPGCPAKKNVCDSKTCHNGGTC 

VNQWDAFSCECPLGFGGKSCAQEMANPQHF 

LGSSLVAWHGLSLPISQPWYLSLMFRTRQAD 

GVLLQAITRGRSTITLQLREGHVMLSVEGTGL 

QASSLRLEPGRANDGDWHHAQLALGAIGGP 

GHA1LSFDYGQQRAEGNLGPRLHGLHLSNITV 

GGIPGPAGGVARGFRGCLQGVRVSDTPEGVN 

SLDPSHGESINVEQGCSLPDPCDSNPCPANSY 

CSNDWDSYSCSCDPGYYGDNCTNVCDLNPC 

EHQSVCTRKPSAPHGYTCECPPNYLGPYCET 

RIDQPCPRGWWGHPTCGPCNCDVSKGFDPDC 

NKTSGECHOCENHYRPPGSPTCLLCDCYPTG 

SLSRVCDPEDGQCPCKPGVIGRQCDRCDNPF 

AEVTTNGCEVNYDSCPRAIEAGIWWPRTRFG 

LPAAAPCPKGSFGTAVRHCDEHRGWLPPNLF 

NCTSITFSELKGFAERLQRNESGLDSGRSQQL 

ALLLRN ATQHTAGYFG SDVKVAYQLATRLL 

AHESTQRGFGLSATQDVHFTENLLRVGSALL 

DTANKRHWELIQQTEGGTAWLLQHYEAYAS 

ALAQNMRHTYLSPFTIVTPNIVISVVRLDKGN 

FAGAKLPRYEALRGEQPPDLETTVILPESVFR 

ETPPVVRPAGPGEAQEPEELARRQRRHPELSQ 

GEAVASVIIYRTLAGLLPHNYDPDKRSLRVPK 

RPIINTPWSISVHDDEELLPRALDKPVTVQFR 

LLETEERTKPICVFWNHSILVSGTGGWSARGC 

EWFRNESHVSCQCNHMTSFAVLMDVSRRE 

NGEILPLKTLTYVALG VTLAALLL TFFFLTLL 

RILRSNQHGIRRNLTAALGLAQLVFLLGINQA 

DLPFACTVIAILLHFLYLCTFSWALLEALHLY 

RALTEVRDVNTGPMRFYYMLGWGVPAFITG 

LAVGLDPEGYGNPDFCWLSIYDTLIWSFAGP 

VAFAVSMSVFLYILAARASCAAQRQGFEKKG 

PVSGLQPSFAVLLLLSATWLLALLSVNSDTLL 

FHYLFATCNC1QGPFIFLSYVVLSKEVRKALK 

LACSRKPSPDPALTTKSTLTSSYNCPSPYADG 

RLYQP\YGDSAGSLHSTSRSGKSQPSYIPFLLR 

EESALNPGNQGPPGLGGIPGR/LCFLGRFKDQQ 

H\DS*TRDFDSDLSLEDDQSGSYASTHSSDSEE

EEEEEEEEAAFPGEQGWDSLLGPGAERLPLHS 

TPKDGGPGPGKAPWPGDFGTTAKESSGNGAP 

EERLRENGDALSREGSLGPLPGSSAQPHKGIL 

KKKCLPTISEKSSLLRLPLEQCTGSSRGSSASE 

GSRGGPPSRPPPRQSLQEQLNGVMPIAMSIKA 

GTVDEDSSGSEJFLFFNFLH 


425 


1775 


A 


3429 


155 


1417 


GEPAVQSCDCGCTQRSCPWLLVAPGLLSSSSS 
RAASVREAEDAPLQPASIHPVSQGSRGPEGSL 
GSAECLPGDPLGARRATRAHSPVPGPPPSLPA 
AGTAVKRGLQPG *GA/GATSTPGTG AATGGL 

SLGCLPSWAS\PGTEHPPGPQGPGPS*DLCSV* 

KREFQRGPWAGMVE.HRISAADPARAPGPDS 

NLQSALQQPATGCSEPAAVYSPPIGLWGA**P 

EYG* PQHSLPG *TAPADR*FiAGIKDRVY SNSI 

YELLENGQRAGTCVLEYATPLQTLFAMSQYS 

QAGFSREDRLEQAKLFCRTLEDILADAPESQN 

NCRLIAYQEPADDSSFSLSQEVLRHLRQEEKE 

EVTVGSLKTSAVPSTSTMSQEPELLISGMEKP 

LPLRTDFS 


426 


1776 


A 


3431 


1662 


369 


AIWWLSWLQHDLLPTPTQVAIDFTASNGDPR 

SSQSLHCLSPRQPNHYLQALRAVGGICQDYD/ 

SVGESGAGGNRQGGLAQRIPQLFLLPSDKRFP 

AFGFGAR1PFNFEVG* MRGKEGDGGRVSQ AE 

JKj\Or HCoKLAL 1 UXaHDr AlNrUrbfSlrbLtvjJS. 

RGDFHLPRLPADTLHTGAQTPLPRAQLPVPST 

HPRPVFl\ElSGVIASYRRCLPQIQLYGPTN\ f AP 

UNRVAEPAQREQSTGQATKYSVLLVLTDGV 

VSD1VL\ETRTAIVRASRLPMSIIIVGVGNADFS 

DMRLLDGDDGPLRCPRGVPAARDIVQFVPFR 

DFKD VSPPGPFRLKDS S ASHPPKSDLRLPPFD 

VLLRTREPSWPP*SPTSPSDDPASPTLPLTPNHI 

TVPTLUAPSALAKCVLAEVPRQVVEYYASQ 

GISPGAPRPCTLATTPSPSP 


427 


1777 


A 


3446


79 


9748 


GCQSCWPAWPRLRRRGPASAGARLGRKAPW 

GLPGRVQDGRPLRFCFYLRPRAPFIAPVLSGA 

ASRPEASGDCRAGRETAMATLEKLMKAFESL 

KSFQQQQQQQQQQQQQQQQQQQQQQQPPPP 

PPPPPPPQLPQPPPQAQPLLPQPQPPPPPPPPPP 

GPAV AEEPLHRPKKEL SATKKDRVNHCLTIC 

ENIVAQSVRNSPEFQKLLGIAMELFLLCSDDA 

ESDVRMVADECLNKVIKALMDSNLPRLQLEL 

YKEIKKNGAPRSLRAALWRFAELAHLVRPQK 

CRPYLVNLLPCLTRTSKRPEESVQETLAAAVP 

KIMASFGNFANONEIKVLLKAFIANLKSSSPTI 

RRTAAGSAVS1CQHSRRTQYFYSWLLNVLLG 

LLVPVEDEHSTLLILGVLLTLRYLVPLLQQQV 

KDTSLKGSFGVTRKEMEVSPSAEQLVQVYEL 

TLHHTQHQDHNWTGALELLQQLFRTPPPEL 

LQTLTAVGGIGQLTAAKEESGGRSRSGSIVEL1 
Ar;^f5Qcr , cp\/i QPifriVfivvi i nrruPAi pnnc 

AOLrljooUox V L.M\JVV^lS.vJiv V LLUttliALE'L'lJo 

ESRSDVSSSALTASVKDEISGELAASSGVSTPG 

SAGHDHTEQPRSQHTLQADSVDLASCDLTSS 

ATDGDEEDILSHSSSQVSAVPSDPAMDLNDG 

TQASSPISDSSQ'iTl'KGPDSAVTPSDSSEIVLD 

GTDNQYLGLQIGQPQDEDEEATGILPDEASEA 

FRNS SMALQQAHLLKNMSHCRQPSDSS VDKF 

VLRDEATEPGDQENKPCR1KGDIGQSTDDDS 

APLVHCVRLLSASFLLTGGKNVLVPDRDVRV 

SVKALALSCVGAAVALHPESFFSKLYKVPLD 



TTEYPEEQYVSDILNYIDHGDPQVRGATAILC 

GTLICSILSRSRFHVGDWMGT1RTLTGNTFSL 

ADCIPLLRKTLKDESSVTCKLACTAVRNCVM 

SLCSSSYSELGLQLIIDVLTLRNSSYWLVRTEL 

LETLAEIDFRLVSFLEAKAENLHRGAHHYTGL 

LKLQERVLNNVVIHLLGDEDPRVRHVAAASL 

IRLVPKLFYKCDQGQADPWAVARDQSSVYL 

KLLMHETQPPSHFSVSTITRJYRGYNLLPSITD 

VTMENNLSRVIAAVSHELITSTTRALTFGCCE 

ALCLLSTAFPVCIWSLGWHCGVPPLSASDESR 

KSCTVGMATMILTLLSSAWFPLDLSAHQDAL 

ILAGNLLAASAPKSLRSSWASEEEANPAATK 

QEEVWPALGDRALVPMVEQLFSHLLKVINrC 

AHVLDDVAPGPAIKAALPSLTNPPSLSPIRRK 

GKEKEPGEQASVPLSPKKGSEASAASRQSDTS 

GPVTTSKSSSLGSFYHLPSYLKLHDVLKATHA 

NYKVTLDLQNSTEKFGGFLRSALDVLSQILEL 

ATLQDIGKCVEEILGYLKSCFSREPMMATVC 

VQQLLKTLFGTNLASQFDGLSSNPSKSQGRA 

QRLGSSSVRPGLYHYCFMAPYTHFTQALADA 

SLRNMVQAEQENDTSGWFDVLQKVSTQLKT 

NLTS VTKNRADKN AI HNHI RLFEPL VI KALK Q 

YTTTTCVQLQKQVLDLLAQLVQLRVNYCLL 

DSDQVFIGFVLKQFEYIEVGQFRESEAIIPNIFF 

FLVLLSYERYHSKQIIGIPKIIQLCDG1MASGR 

KAVTHAIPALQPIVHDLFVLRGTNKADAGKE 

LETQKEVVVSMLLRLIQYHQVLEMF1LVLQQ 

CHKENEDK WKRLSRQ I ADIILPMLAKQQMHI 

DSHEALGVLNTLFEILAPSSLRPVDMLLRSMF 

VTPNTMASVSTVQLWISGILA1LRVLISQSTED 

IVLSRIQELSFSPYLISCTVINRLRDGDSTSTLE 

EHSEGKQIKNLPEETFSRFLLQLVGILLEDIVT 

KQLKVEMSEQQHTFYCQELGTLLMCLIHIFKS 

GMFRJRTT AAATRLFR SDGCGGSF YTLD SLNLR 

ARSMITTHP AL VLL W CQILLL VNHTD YR W W 

AEVQQTPKRHSLSSTKLLSPQMSGEEEDSDLA 

AKLGMCNREIVRRGALILFCDYVCQNLHDSE 

HLTWLIVNHIQDLISL SHEPP VQDFISA VHRNS 

AASGLFIQAIQSRCENLSTPTMLKXTLQCLEGI 

HLSQSGAVLTLYVDRLLCTPFRVLARMVDIL 

ACRRVEMLL.\ANLQS SMAQLPMEELNRIQEY 

LQSSGLAQRHQRLYSLLDRFRLSTMQDSLSPS 

PPVSSHPLDGDGHVSLETVSPDKDWYVHLVK 

SQCWTRSDSALLEGAELVNRIPAEDMNAFM 

MNSEFNLSLLAPCLSLGMSEISGGQKSALFEA 

AREVTLARVSGTVQQLPAVHHVFQPELPAEP 

AAYWSKLNDLFGDAALYQSLPTLARALAQY 

LWVSKLPSHLHLPPEKEKDIVKFWATLEAL 

SWHLIHEQIPLSLDLQAGLDCCCLALQLPGL 

WS WS STEFVTHACSLI YCVHFILEAVAVQPG 

EQLLSPERRTNTPKAI SEEEEEVDPNTQNPKYI 

TAACEMVAEMVESLQSVLALGHKRNSGVPA 

FLTPLLRNIIISLARLPLVNSYTRVPPLVWKLG 

WSPKPGGDFGT AFPEBP VEFLQEICEVFICEF1 YR 

INTLGWTSRTQFEETWATLLGVLVTQPLVME 

QEESPPEEDTERTQINVLAVQAITSLVLSAMT 

VPVAGNPAVSCLEQQPRNKPLKALDTRFGRK 

LSIIRGIVEQEIQAMVSKRENIATHHLYQAWD 

PVPSLSPATTGALISHEKLLLQINPERELGSMS 

YKLGQ VSIHS V WLGN SITPLREEEWDEEEEEE

ADAPAPSSPPTSPVNSRKHRAGVDIHSCSQFL 

LELYSRWILPSSSARRTPAILISEVVRSLLWS 

DLFTERNQFELMYVTLTELRRVHPSEDEELAQ 

YI , VPATCK AAAVLGMDKA VAEPVSRLLESTL 

RSSHLPSRVGALHGILYVLECDLLDDTAKQLI 

PVISDYLLSNLKG1AHCVNIHSQQHVLVMCAT 

ESTPSIIYHCALRGLERLLLSEQLSRLDAESLV 
KLSVDRVNVHSPHRAMAALGLMLTCMYTG 
KEKVSPGRTSDPNPAAPDSESVIVAMERVSVL 
FDRIRKGFPCEARWAJULPQFLDDFFPPQDIM 

TGQSSMVRDWVMLSLSNFTQRAPVAMATWS 
LSCFFVSASTSPWVAAILPHVISRMGKLEQVD 
VNLFCLVATBFYRHQIEEELDRRAFQSVLEV 
VAAPGSPYHRLLTCLRNVHKVTTC 


428 


1778 


A 


3449 


3 


430 


NSRPSPSAALVEVLLRSGSTFPHTVSGGWAA 
WGPWSSCSRDCELGFRVRKRTCTNPEPRNGG 

LrCVvjlJAAbi l^yCNr yALr V JK.uA W oL W 1 o 

WSPCSASCGGGHYQRTRSCTSPAPSPGEDICL 
GLHTEEALCATQACPEGWS 


429 


1779 


A 


3464 


583 


3 


DALDRRYLERCHPAAGGWVGEGE*ALCQKT/ 

RFSGVLEPPLPSLKDGGRFPAWT*RSCSKSLR 

AAFTSQFFPSRRSRASPGSAPNGKGQNLTEQHP 

CPGSCDPQVLSASWM*VEHRSKFRPPP*NSTI 

PPES/RS* QGGTVQTGQHSSGREAGSWRARGR 

NAGRR*KGGGKJGTKQGAVRARKECRGEMA 

SGETDSE 


430 


1780 


A 


3473 


2802 


270 


FRMR1FLHCPWNQQMWK1WNLLETSLESCKA 

HLS1QKLLKER\Q\QLPVFKHRDSIVETLKRHR 

VVVVAGET\G SGKSTQVPHFLLEDLLLNE WE 

ASKCNIVCTQPRRISAVSLANRVCDELGCENG 

PGGRNSLCGYQIRMESRACESTRLLYCTTGV 

LLRKLQEDGLLSWVS/HMFIVDEV\HERVSVQS 

DFLLIILKEILQKRSDLHLILMSATVDSEKFST 

YFTHCPILRISGRSYPVEVFHLEDIIEETGFVLE 

KDSEYCQKPLEEEEEVTTNVTSKAGGIKKYQE 

YIPVQTGAHADLNPFYQKYSSRTQHAILYMN 

PHKINLDLILELLAYLDKSPQFRNIEGAVLIFL 

PGLAHIQQLYDLLSNDRRFYSERYKVIALHSI 

LSTQDQAAAFTLPPPGVRKIVLATNIAETGITI 

PDVVFVIDTGRTKENKYHESSQMSSLVETFVS 

KASALQRQGRAGRVRDGFCFRMYTRERFEG 

FMDYSVPEILRVPLEELCLHIMKCNLGSPEDF 

LSKALDPPQLQVISNAMNLLRKJGACELNEPK 

1 TP? nrvHT A AT PVTsIVT^WTIfAvfT TPflArRfW" 1 ! HP 

VATLAA VMTEKSPFT l'PIGRKDEADLAKSAL 
AMADSDHLTIYNAYLGWKKARQEGGYRSEI 
TYCRRNFLNRTSLLTLEDVKQELIKLVKAAGF 
SSSTTSTSWEGNRASQTLSFQEIALLKAVLVA 
OT YDNVGKIIYTKSVDVTFKT ACIVFTAOflK 

AQVHPSSVNRDLQTHGWLLYQEK1RYARVY 
LRETTLITPFPVLLFGGDIEVQHRERLLSIDGW 
IYFQAPVKIAVIFKQLRVLIDSVLRKKLENPK 
MSLENDKILQIITELIKTENN 


431 


1781 


A 


3474 


1 


441 


FRPAPGH VQP* GG SSAAAGGGLLSHPRPCQQ 
PCPPAPAPSRPRSLG SLGQRVP AALATAAQEL 
PATLGGDGGKPALTAGEAALPGLHRSGVPAA 
AARC*PCT/SRPT*STLSPTQAAWWCRPSRRQ 
QRGEASTGGASGRRCGSCFQV


432 


1782 


A 


3478 


416 


23 


QLRRLTLPNFKTY/YSS*1IEIAWH**KNMQID 
QWFRRESPEIDLCKYS*LSFDKEAKAIK/WKJE 
CSLr^KWCTYKNWM/LHVQKXRI* VQTLI IPS 
QKLK\SKWIKDLNVECRITKLLDQEYPGDLGY 
SRALNSGSR 


433 


1783 


A 


3504 


1876 


552 


CLAPCSPQPEKNGMQPLLLLLPPLLYQQLLHS 

SLGAPGESTLLVRTSKLLVGLGLQLLVWLLL 

QTRSLLALQLHLTSSAPLLAAPTAVCSCSRCS 

APRSRC V ARPAARTGLPTPAP A S SP AP AASP A 

PAASPAPAESTA\PQPLILLPKP/PPAPGAPPPRP 

GAPPPRPAASPSPAASPAPPAASPVLTASPPLP 

AASPSPAASPAPPAASPVLTASPPLPAASPSPA 

ASPAPPAASFVLTASPPLPAASPALAASPVHT 

ASPPVHVASPPVHTASPPVHVASPPVHTASPP 

VHVASPPVHTASPHVHVASPPVHVASPPVHV 

ASPPVHTASPPVHVASPPVHTASPHVHVASPP 

VHTASPPVHVASPPVHVASPPVHVAYPPVHV 

ASPPVHVASPPVHVASPPVSCSGDSTSDCFPP 

QPGAVFPHSLAPSLGGWSHLVAALP 


434 


1784 


A 


3516 


142 


590 


GGVNRPRSETEQVKTPVLISSWDYRHPPPRPA 

SFFVFLV*TGF\TALARMVL1SWPCDLPTSASQ 

SAGITGVRHHA\RLLYFEQESHSVTQAGW\VQ 

WHNLGSLQPLSLEDRLSPGVLGCSALCRSGV 

RTKFGINMVTSRERGTTRLPKEG 


435 


1785 


A 


3529 


1 


3161 


MSLVRAALEALDELDLFGVKGGPQSVIHVLA 

DEVQHCQSILNSLLPRASTSKEVDASLLSVVS 

FPAFAVEDSQLVELTKQEIITKLQGRYGCCRF 

LRDGYKTPKEDPNRLYY/ENPAELKJLFENIEC 

EWPLFWTYFILDGVFSGNAEQVQEYKEALEA 

VLDCGKNGVPLLPELYSVPPDRVDEEYQNPHT 

VDRVPMGKLPHMWGQSLYILGSLMAEGFLA 

PGEIDPLNRRFSTVPKPDVWQVYPSLPHGCS 

SKSPSHQCTIISIRTTRKITAPVSILAETEEIKTIL 

KDKGI Y VETI AEV YPIRVQPARJLSHI YS SLEIF 

LPFLNSVSGCNNRMKLSGRPYRHMGVLGTSK 

LYDIRKTIFTFTPQFIDQQQFYLALDNKMIVE 

MLRTDLSYLCSRWRMTGQPTITFPISHSMLDE 

DGTSLNSSILAALRKMQDGYFGGARVQTGKL 

SEFLTTSCCIHLSFMDPGPEGKLYSEDYDDN 

YDYLESGNWMNDYDSTSHARCGDEVARYL 

DHLLAHTAPHPKLAPTSQKGGLDRFQAAVQT 

TCDLMSLVTKAKELHVQNVHMYLPTKLFQA 

SRPSFNLLDSPHPRQENQVPSVRVEIHLPRDQ 

SGEVDFKALVLQLKETSSLQEQADILYMLYT 

MKGPDWNTELYNERSATVRELLTELYGKVG 

EIRHWGLIRYISGILRKKVEALDEACTDLLSH 

QKHLTVGLPPEPREKTISAPLPYEALTQLIDEA 

SEGDMSISILTQEIMVYLAMYMRTQPGLFAE 

MFRLRJGLIIQVMATELAHSLRCSAEEATEGL 

MNLSPSAMKNLLHHILSGKEFGVERSVRPTD 

SNVSPAISIHEIGAVGATKTERTG1MQLKSEIK 

QSPGTSMTPSSGSFPSAYDQQSSKDSRQGQW 

QRRRRLDGALNRVPVGFYQKVWKVLQKCH 

GLS VEGFVLPS STTREMTPGETKFS VHVESVL 

NRVPQPEYRQLLVEAIL\VLTMLADIEI\HSIGS 

IIAVEKIVHIANDLFLQEQKTLGADDTMLAKD 

PASGICTLLYDSAPSGRFGTMTYLSKAAATY 

VQEFLPHSICAMQ 


436 


1786 


A 


3546 


73 


393 


CP*LTWELLEVKKAEVLQDSLDGRYSTPSSCL 
EQPDSCRPYGRSFYALEEKHVIFSLDVGETDN 



KGKGKTIRGI*TFKGRKGGTYQREHDANPLA
PXSARSCWMRKG 


437 


1787 


A 


3554 


5157 


2939 


AVRAEPGLEELS SGLRAHSPSATTVCEPEAQG 
SASGCRYAAHPHWGLGGAAAAGGSWEPQPP 
RFVCEPAGRGKPHPPAAPRSPLLPGSRRRPHA 
AQPGARARTSPPPASARNMAARPAATLAWSL 
LLLSSALLREGCRARFVAERDSEDDGEEPVVF 
PESPLQSPTVLVAVLARNAAHTLPHFLGCLER 
LDYPKSRMAIWAATDHNVDN1TEIFREWLK 
NVQRLYHYVEWRPMDEPESYPDEIGPKHWP 
TSRF AHVMKLRQ AALRTAREKW SD YILFIDV 
DNFLTKPQTLNLLIAENKTIVAPMLESRGLYS 
NFWCGITPKGFYKRTPDYWQJREWKRTGCFP 
VPMVHSTFLIDLRKEASDKLTFYPPHQDYTW 
TFDDIIVFAFS SRQAGIQMYLCNREHYGYLPIP 
LKPHQTLQEDIENLIHVQIEAMIDRPPMEPSQ 
i VjV VrKYrJJiUviOrlJliJrMlNLJSJKJ<^ 
RWLRTLYEQEIEVKIVEAVDGKALNTSQLKA 
LNIEMLPGYRDPYSSRPLTRGEIGCFLSHYSV 
WKEVIDRELEKTLVIEDDVRFEHQFKKKLMK 
LMDNIDQAQLDWELIYIGRXRMQVKEPEKA 

\rpXT\/ A"KTT VP A nVQVtt/TT nV\/?CT PP AWI \f 
V ri\ V AJN JL V E-AJJ loIWI L>\J I V loLifcioAV^JMj V 

GANPFGKMLPVDEFLPVMYNKHPVAEYKEY 
YESRDLKAFSAEPLLIYPTHYTGQPGYLSDTE 
TSTIWDNETVATDWDRTHAWKSRKQSRIYSN 
AKNTEALPPPTSLDTVPSRDEL 


438 


1788 


A 


3563 


130 


527 


IFFNSSSLFCRVFCLFLRWSFTLVAQARVQ*C 
NLSSLQPLPPGFK*FSCLSPPRS*DYRRPPPRPA 
NFLYF* *RQGFTVLGQ AGLELLT/S/GDPPTSA 
SQSAGITGVSHRAWPVHA1STHISLVKTRPSLT 
TLG 


439 


1789 


A 


3565 


446 


1834 


LLQPAMRKSPGLSDCLWAWILLLSTLTGRSY 

GQPSLQDELKDNTTVFTRILDRLLDGYDNRL 

RPGLGERVTEVKTDIFVTSFGPVSDHDMEYTI 

DVFFRQSWKDERLKFKGPMTVLRLNNLMAS 

KIWTPDTFFHNGKKSVAHNMTMPNKLLRITE 

nnTT T VTTV/TPT TVl?\AFPP\/AF/TI?nFP\vfYn\AW 
1J\J 1 Ijij I 1 JVIKJj 1 V JKX/VC/V^r iYLAJr O.KXM* r Jvl VL/lA-M 

ACPLKFGSYAYTRAEWYEWTREPARSVW 
AEDGSRLNQYDLLGQTVDSGIVQSSTGEYW 
MTTHtHLKRKlGYFVIQTYLPCIMTVILSQVSF 
WT NURS VPARTVFOVTTVI TMTTT <!l<iARN<;r 

yiI/Inimjo vi novi vruY i i v v i ivi i i LiOIOAAJN jLr 

PKVAYATAMDWFIAVCYAFVFSALIEFATVN 

YFTKRGYAWDGKSVVPEKPKKVKDPLIKKN 

NTYAPTATSYTPNLARGDPGLATIAKSATIEP 

KEVKPETKPPEPKKTFNSVSKIDRLSRIAFPLL 

FGIFNLVYWATYLNREPQLKAPTPHQ 


440 


1790 


A 


3568 


1 


350 


STSSCFPAAAAAIMREIVHLQAGQCGNQIGAK 
FWEVISDEHGIDPTGTYHGDSDLQLERINVYY 
NEATGEAPVPSPTALRGPRGPCLG*RPPVPAG 
GKYVPRAVLVDMEPGTMDSV 


441 


1791 


A 


3569 


2 


1751 


FVAVAGAVSGEPLVHWCTQQLRKTFGLDVS 

EEIIQYVLSIESAEEIREYVTDLLQGNEGKKGQ 

FIEELITKWQKNDQELISDPLQQCFKKDEILDG 

QKSGDHLKRGRKKGRNRQEVPAFTEPDTTAE 

VKTPFDMKAQENSNSVKKK'TKFVNLYTREG 

QDRLAVLLPGRHPCDCLGQKHKLINNCLICG 

RIVCEQEGSGPCLFCGTLVCTHEEQDILRGDS 

N\KSQKLLKKLMSGVENSGKVDISTKDLLPH 

QELRIKSGLEKAIKHKDKLLEFDRTSIRRTQVI 
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Amino acid sequence (A=Alanine C=Cysteine, 
D=Aspartic Acid, E=Glutamic Acid,
F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine,
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T=Threonine, V=Valine, W=Tryptophan, 
Y=Tyrosine, X=Unknown, *=Stop codon, 
/=possible nucleotide deletion, \=possible 
nucleotide insertion 














DDESDYFASDSNQWLSKLERETLQKJREEELR 
m ouaqpt QVV\rrTnPAnR1fIT FFFlsI^I AFYH 

SRLDETIQAIANGTLNQPLTKLDRSSEEPLGVL 
VNPNMYQSPPQWVDHTGAASQKKAFRSSGF 
GLEFNSFQHQLRIQDQEFQEGFDGGWCLSVH 
QP WA SLL VRGIKRVEGRS WYTPHRGRL WI AA 
TAKKPSPQEV SELQ ATYRLLRGKDVEFPND Y 
PSGCLLGCVDLIDCLSQKQFKEQFPDISQESDS 
PFVHCKNPQEMWKFPIKGNPKIWKLDSKIH 
QGAKKGLMKQNKAV 


442 


1792 


A 


3576 


1 


2019 


MPRSHTGERLCEGKEGSQCAENFSPNLSVTK 

KTAGVKPYECTICGKAFMRLSSLTRHMRSHT 

A1RAI\EKPYKCKEC\GRAFSLSQILSK\HERSH 

TGEKPYKCKQCGKTFIYHQPFQRHERTHIGEK 

P YECKQCGKAL SCSS SLRVHERIHTGEKPYEC 

KQCGKAFSCSSSIRVHERTHTGEKPYACKVEC 

GKAFIS\TTSVLTHMITHNGDRPYKCKECGKA 

FIFPSFLRVHERIHTGEKPYKCKQCGKAFRWS 

TSIQIHERIHTGEKPYKCKECGKSFSARPAFRV 

HVRVHTGEKPYKCKECGKAFSRISYFRIHERT 

HTGEKPYECKKCGKTFNYPLDLKIHKRNHTG 

EKPYECKECAKTFISLENFRRHMITHTGDGPY 

KCRDCGKVFIFPSALRTHERTHTGEKPYECKQ 

CGKAF SCSS YlKIHrvK 1 ri 1 0-bKVr Y bCisJbtAjK 

AHYPTSFQGHMRMHTGEKPYKCKECGKAFS 

t iicccD\'i>tjrTT?iT-rKTV'Pi(rPi 'Pr % *C\\r i (~it^ AF^V^IT^ 
LHobrKVKii 1 KJrliN I iiivr.L.r.t~. ^\LUi\Af avaio 

LKKPMRNAQSDRKLY/KCEK+EKVFNSNRCF 
QSCENSH*REKSCQCK*YRKRDTR*FMYSQV
1 PHNHVSVSNGPYR/CGSPIRLYNT+NISINRNL 
I VAWTP*CSTLFKCLWCWCKRAALSVV*/IVQ 
I DSGRGRWLTPVIPALWEAKAGGSRGQEIKTIL 

ANTVKPHLY 


443 


1793 


A 


3578 


287 


114 


DFYERKFEQFEEGHKQIVNKWRDLLCSWKRK 
LSI1KKSVLQNNL+FSAASMRFQKVFF 


444


1794 


A 


3582 


3335 


1909 


HLFFSLFLAAMAMTGSTPCSSMSNHTKERVT 
MTKVTLENFYSNLIAQHEEREMRQKKLEKV 
MEEEGLKDEEKRLRRSAHARKETEFLRLKRT 
RLGLEDFESLKVIGRGAFGEVRLVQKKDTGH 
VYAMKJLRKADMLEKEQVGHIRAERDILVEA 
DSLWVVKMFYSFQDKLNLYLIMEFLPGGDM 
MTLLMKKDTLTEEETQFYIAETVLAIDSIHQL

KAHRTEFYRNLNHSLPSDFTFQNMNSKRKAE
TWKRNRRQLAFSTVGTPDYIAPEVFMQTGYN
KLCDWWSLGVIMYEMLIGYPPFCSETPQETY
KKVMNWKJETLTFPPEVPISEKAKDLILRFCCE
WEHR1GAPGVEEDCSNSFFEGVDWEHIRERPA
A1SIEIKSIDDTSNFDEFPESDILKPTVATSNHPE
TDYKNKDWVFINYTYKRFEGLTARGAIPSYM
KAAK


445 


1795 


A 


3584 


1 


6169 


RTRGIEKRFAYSFLQQLIRYVDEAHQYILEFD
GGSRGKGEHFPYEOEIKFFAKVVLPLIDQYFK
NHRLYFLSAASRPLCSGGHASNKEKEMVTSL 
FCKLGVLVRHRISLFGNDATSIVNCLHILGQT 
LDARTVMKTGLESVKSALRAFLDNAAEDLE 
KTMENLKQGQFTHTRNQPKGVTQIINYTTVA 
LLPMLSSLFEHIGQHQFGEDLILEDVQVSCYRI 
LTSLYALGTSKSIYVERQRSALGECLAAFAGA 
FPVAFLETHLDKHNIYSIYNTKSSRERAALSLP 
TNVEDVCPNIPSLEKLMEEIVELAESGIRYTQ 
Amino acid sequence (A=Alanine, C=Cysteine,
D=Aspartic Acid, E=Glutamic Acid,
F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine,
M=Methionine, N=Asparagine, P=Proline,
Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan,
Y=Tyrosine, X=Unknown, *=Stop codon,
/=possible nucleotide deletion, \=possible
nucleotide insertion 














MPHVMEVILPMLCSYMSRWWEHGPENNPER 

AEMCCTALNSEHMNTLLGNILKDYNNLGIDE 

GAWMKRLAVFSQPriNKVKPQLLKTHf-LPLM 

EKLKKKAATVVSEEDHLKAEARGDMSEAEL 

1J0.DEFTTLARDLYAFYPLLIRFGDYNRAKWL 

KEFNPEAEELFRMVAEVFIYWSKSHNFKREE 

QNFVVQNEINNMSFLITDTKSKMSKAAVSDQ 

ERKKMQCRKGDRYSMQTSLIVAALKRLLPIGL 

NICAPGDQELIALAKNRFSLKDTEDEVRDIIRS 

NIHLQGKLEDPAIRWQMALYKDLPNRTDDTS 

DPEKTVERVLDIANVLFHLEQKSKRVGRRHY 

CLVEHPQRSKKAVWHKLLSKQRKRAWACF 

RMAPLYNLPRHRAVNLFLQGYEKSWIETEEH 

YFEDKLIEDLAKPGAEPPEEDEGTKRVDPLHQ 

LILLFSRTALTEKCKLEEDFLYMAYADIMAKS 

CHDEEDDDGEEEVKSFEEKEMEKQKLLYQQ 

ARLHDRGAAEMVLQTISASKGETGPMVAAT 

LKLGIAILNGGNSTVQQKMLDYLKEKKDVGF 

FQSLAGLMQSCSVLDLNAFERQNKAEGLGM 

VTEEG SGEKVLQDDEFTCDLFRFLQLLCEGH 

NSDFQNYLRTQTGNNTTVNI II STVD YLLRVQ 

ESISDFYWYYSGKDVIDEQGQRNFSKAIQVA 

KQVFNTLTEYIQGPCTGNQQSLAHSRLWDAV 

VGFLHVFAHMQMKLSQDSSQIELLKELMDLQ 

KI)MVVMLLSMLEGNVVNGTIGKQMVDMLV 

ESSNNVEMILKFFDMFLKLKDLTSSDTFKEYD 

PDGKGVIFKRDFHKAMESHKHYTQSETEFLL 

SCAETDENETLDYEEFVKRFHEPAKDIGFNVA 

VLLTNLSEHMPNDTRLQTFLELAESVLNYFQP 

FLGRIEIMGSAKRIERVYFEISESSRTQWEKPQ 

VKESKRQFIFDWNEGGEKEKMELFVNFCED 

TIFEMQLAAQISESDLNERSANKEESEKERPEE 

QGPRMAFFSILTVRSALFALRYNILTLMRMLS 

LKSLKKQMKKVKKMTVKDMVTAFFSSYWSI 

FMTLLHFVASVFRGFFRJICSLLLGGSLVEGA 

KKIKVAELLANMPDPTQDEVRGDGEEGERKP 

LEAALPSEDLTDLKELTEESDLLSDIFGLDLKR 

EGGQYKLJPHNFNAGL SDLMSNPVPMPEVQE 

KFQEQKAKEEEKEEKEETKSEPEKAEGEDGE 

KEEKAKEDKGKQKLRQLHTHRYGEPEVPESA 

FWKKIIAYQQKLLNYFARNFYNMRMLALFV 

AFAINFILLFYKVSTSSVVEGKELPTRSSSENA 

KVTSLDSSSHRIIAVHYVT.FESSGYMEPTVRIL 

PILHTVISFFCIIGYYCLKVPLVIFKREKEVARK 

LEFDGLYITEQPSEDDIKGQWDRLVINTQSFP 

NNYWDKFVKRKVMDKYGEFYGRDRISELLG 

VKYQMWKLGWFTDNSFLYLAWYMTMSVL 

GHYV^FFFAAHLLDIAMGFKTLRTILSSVTH 

NGKQLVLTVGLLAVWYLYTWAFNFFRKF 

YNKSEDGDTPDMKCDDMLTCYMFHMYVGV 

RAGGGIGDEIEDPAGDEYEIYRIIFDITFFFFVI 

VILLAIIQGLHDAFGELRDQQEQVKEDMETKC 

F1CGIGNDYFDTVPHGFETHTLQEHNLANYLF 

FLMYLINKDETEHTGQESYVWBCMYQERCWE 

FFPAGDCFRKQYEDQLN 


446 


1796 


A 


3592 


1 


355 


AGLELLNSDDPPALASQSAGITGVTRTPSLFF* 
DTVLLCCSG WSAVAPSRLTAALFS* AQAVCL 
SLPRSWDYRRW/PPHPANFCIFCRDE/SUWML 
PRLVSNSWTQAILLPRPPKMLGLQV 
Amino acid sequence (A=Alanine, C=Cysteine,
D=Aspartic Acid, E=Glutamic Acid, 
F=Phenylalanine, G=Glycine, H=Histidine, 
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M=Methionine, N=Asparagine, P=Proline, 
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T=Threonine, V- Valine, W=Tryptophan, 
Y=Tyrosine, X=Unknown, *=Stop codon,
/=possible nucleotide deletion, \=possible
nucleotide insertion 


447 


1797 


A 


3598 


1202 


1070 


LFVGGGPICPEGASGFAPGPAPAPRVGVDAEV 
GR*V*GAAASQGA/GSLRPRPTGPGHPGAWL 
QV WG AAAVC AGPAM */AVRAKRGPRAG* EP 
NSPWRSGVLAA\RAVGAGPWP*P*PGCS*ARG 
PSSRSAPGLASGPAAPLLQGVHSSAGPLLCYI 
NGTL ALGLKP* * AWG WGEWRPKG 


448 


1798 


A 



3604 


3115 


557 


FRRKGGGGPKDFGAGLKYNSRHEKVNGLEE 

GVEFLPVNNVKKVEKHGPGRWWLAAVLIG 

LLLVLLGTGFL V WHLQ YRDVRVQKVFNG Y M 

RITNENFVDAYENSNSTEFVSLASKVKDALICL 

LYSGVPFLGPYHKESAVTAFSEGSVIAYYWSE 

FSIPQHLVEEAERVMAEERWMLPPRARSLKS 

FWTS W AFPTD SKTVQRTQDN SCSFGLH AR 

GVELMRFTTPGFPDSPYPAHARCQWALRGD 

ADSVLSLTFRSFDLASCDERGRHLV\TVYNTVL 

SPMEPHA\L VQLCGT YPPS Y NLTFHS\S\QNVL 

LlTLiTNTERRHPG\FEATFFQLPRMSSCGGRL 

RKAQGTFN SP YYPGH YPPN IDCTWNIEVPNN 

QHVKVRFKFFYLLEPGVPAGTCPKDYVEING 

EKYCGERSQFVVTSNSNKITVRFHSDQSYTDT 

GFLAEYLSYDSSDPCPGQFTCRTGRCIRKELR 

CDGWADCTDHSDELNCSCDAGHQFTCKNKF 

CKPLFWVCDSLNDCGDNSDEQGCSCPVAQTF 

RCSNGKCLSKSQQCNGKDDCGDGSDEASCP 

KVNVVTCTKHTYRCLNGLCLSKGNPECDGK 

EDCSDGSDEKDCDCGLRSFTRQARWGGTD 

ADEGEWPWQVSLHALGQGHICGASUSPNWL 

VSAAHCYIDDRGFRYSDPTQWTAFLGLHDQS 

QRS APG VQERRLKRII SHPFFNDFTFD YDIALL 

ELEKP AE YS SM VRPI CLPDASH VFPAG KAI WV 

TGWGHTQYGGTGAL1LQKGEIRVINQTTCEN 

LLPQQITPRMMCVGFLSGGVDSCQGDSGGPL 

SSVEADGRJFQAGVVSWGDGCAQRNKPGVY 

TRLPLFRDW1KENTGV 


449 


1799 


A 


3618 


2 


613 


FVSGSPWRMDGSTERLEARRPAGRLPWSSRQ 

EMTRRPSLMAGRQHGWSAQQSATVANPVPG 

ANPDLLPHFLGEPEDVYIVKNKPVLLVCKAV 

PATQIFFKCNGEWVRQVDHVIERSTDGSSGLP 

TMEVRINV SRQQVEKVFGLEE Y WCQCVA WS 

SSGTTKSQKAY1RIAYLRKNFEQEPLAKEVSL 

EQGIVLPCRPPEGIPPAE 


450 


1800 


A 


3620 


1 


2676 


MEPSLGQGMDLTCPFGVSPACGAQASWSIFG 

ADAAEVPGTRGHSQQEAAMPHIPEDEEPPGE 

PQAAQSPAGQQGPPTAGVSCSPTPTIVLTGDA 

TSPEGETDKNLANRVHSPHKRLSHRHLKVST 

ASLTSVDPAGHIIDLVNDQLPDISISEEDKKKN 

LALLEEAKLVSERFLTRRGRKSRSSPGDSPSA 

VSPNLSPSASPTSSRSNSLTVPTPPEGDEADVS 

SPHPGEPNVPKGLADRKQNDQRKVSQGRLAP 

RPPPVEKSKE1AIEQKENFDPL0YPETTPKGLA 

PVTN SSGKMALNSPQPGPVESELGKQLLKTG 

WEGSPLPRSPTQDAAGVGPPASQGRGPAGEP 

MGPEAGSKAELPPTVSRPPLLRGLSWDSGPEE 

PGPRLQKVLAKLPLAEEEKRFAGKAGGKLAK 

APGLKDFQIQVQPVRMQKLTKLREEHILMRN 

QNLVGLKLPDL SEAAEQEKGLPSELSPAIEEE 

ESKSGLDVMPNISDVLLRKLRVHRSLPGSAPP 

LTEKEVEN VFVQLS S A FRND S YTLESRINQ AE 

RERNLTEENTEKELENFKASITSSASLWHHCE 

HRET YQKLLEDI AVLHRLAARLS SRAEWGA 
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Amino acid sequence (A=Alanine C=Cysteine, 
D=Aspartic Acid, E=Glutamic Acid,
F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine,
M=Methionine, N=Asparagine, P=Proline,
Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan,
Y=Tyrosine, X=Unknown, *=Stop codon,
/=possible nucleotide deletion, \=possible
nucleotide insertion 














VRQEKRMSKATEVMMQYVENLKRTYEKDH 
AELMEFKKLANQNSSRSCGPSEDGVLRTARS 
M S LTLGKNM PRRR V S V A V VP KFN ALNLPG Q 

TPSSSSIPSLPALSESPNGKGSLPVTSALPALLE 

NGKTNGDPDCEASAPALTLSCLEELSQETKA 

RMEEEAYSKGFQEGLKKTKELQDLKEEEEEQ 

KSESPEEPEEVEETEEEEKDPRSSKLEELVHFL 

QVMYPKXCQHWQV1WMMAAVMLVLTVVL 

GLYNSYNSCAEQADGPLGRSTCSAAQKDSW 

WSSGLQHEQPTEQ 


451 


1801 


A 


3623 


504 


198 


QLIQHQTVHTGRKLYECKECGKAFNQGSTLI 
RHQRIHTGEKPYECXVCGKAFRVSSQLKQHQ 
RIHTGERPYQCKELKGRGAEMLAVLAVKEQ 
NRTPVNYGK 


452 


1802 


A 


3628 


2 


195 


MTCLHSAKAFHY* SSCSFSCEEGFAL1GPEV V 

QCTALGVWTAPAPVCIAVQCQHLEALNEGT 

MG*DYPFTAFAYGSSCKYECHTVYRVRGLD 

MLHSRGCYLWNGHFTT*EAISCEPLERPCH*S 

V*CSFSCEEGFALIGPEWQCTALGVWTAPAP 

VCIAVQCQHLEALNEGTMG 


453 


1803 


A 


3637 


662 


142 


IQAKGLGIWHVPNKSPMQHWR\KGSLLRYRT 

DTGFLQTLGHNLLGIYQKYPVKYGEGKCWT 

DNGPVIPVVYDFGDAQKTASYYSPYGQREFT 

AGFVQFRVFNNERAANALCAGMRVTGCNTE 

HHCIGGGGYFPEASPQQCGDFSGFDWSGYGT 
\H VG YS S SREIT E\AA VLLFYR 


454 


1804 


A 


3641 


1 


362 


TQVHPAMLGLDELGRSGCXjHCTQADLRFGD 
AAGRDPGQDNDRNTAEPAFPPPPRVMAAAA 
ALRAPAQSSVTFEDVAVNFSLEEWSLLNEAQ 
GCLYHDVMLETLTLISSLGKVLILNCDLS 


455 


1805 


A 


3646 


2 


414 


AAAGRGASGALTGEGGGEQGRRVGLGSRAH 
SLLLGPTFNSCQVS SQPPRVAGLGLPLKHEPS 
RPQPPSPRGPRTVRAGVPGAHPQDTPCPEFVR 
PRKVPLVGEAPGLPPEERSRGWRRDTPGLQE 
SRVRAPSYDDIT 


456 


1806 


A 


3656 


396 


8 


QIVSmSYLTLYTKNNLKSMKDLNVNTEMIK 
LLELKNIHNLG*AKFFLN*IQKALIKItKILIHW 
P/LIKIK/SFCSLSDTIKKMKRQTiVWEQTFIIHI 
SVKELVSRIYEAFLQFNKTVNRPVFDIKKEQK 
F 


457 


1807 


A 


3660 


14 


1961 


SEAKLGGPTGMDLWQLLLTLALAGSSDAFSG 

SEATAAILSRAPWSLQSVNPGLKTNSSKEPKF 

TKCRSPERETFSCHWTDEVHHGTKNLGPIQLF 

YTRRNTQEWTQEWKECPDYVSAGENSCYFN 

SSFTSIWffYCIKLTSNGGTVDEKCFSVDEIVQ 

PDPPIALNWTLLNVSLTGIHADIQVRWEAPRN 

ADIQKGWMVLEYELQYKEVNETKWKMMDP 

ILTTSVPVYSLKVDKEYEVRVRSKQRNSGNY 

GEFSEVLYVTLPQMSQFTCEEDFYFPWLLIIIF 

GIFGLTVMLFVFLFSKQQRIKMLILPPVPVPKI 

KGIDPDLLKEGKLEEVNT1LAIHDSYKPEFHS 

DDSWVEF1ELDIDEPDEKTEESDTDRLLSSDH 

EKLHINLGVKDGDSGRTSCCEPDILETDFNAH 

DIHEGTSEVAQPQRLKGEADLLCLDQKNQNN 

SPYHDACPATQQPSVIQAEKNKPQPLPTEGAE 

STHQAAHIQLSNPS SLSNIDFY AQ VSDITP AGS 

VVLSPGQKNKAGMSQCDMHPEMVSLCQENF 

LMDNAYFCEADAKKCIPVAPHIKVESH1QP\S 

LNQEDIYITTESLT\TAAGSP\GTGEHVPGSEM 
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ng to first 
amino acid 
residue of 
peptide 
sequence 



154 



3664 



902 



Predicted end 
nucleotide 
location 
corresponding 
to last amino 
acid residue 
of peptide 
sequence 



462 



135 



3670 



850 



3671 



3672 



3673 



3676 



3679 



2472 



394 



348 



2253 



557 



2099 



110 



320 



8 



803 



Amino acid sequence (A=Alanine, C=Cysteine,
D=Aspartic Acid, E=Glutamic Acid,
F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine,
M=Methionine, N=Asparagine, P=Proline,
Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan,
Y=Tyrosine, X=Unknown, *=Stop codon,
/=possible nucleotide deletion, \=possible
nucleotide insertion 



PVPDYTSIHIVQSPQGLILNATALPLPDKEFLS 
SCGYVSTPQLNKIMP 



TRAPASGRSGAGLALSANAPDSGGHPGATEG 
PAG SLAHASGSARGTWRVRGRGSHG WERT V 
GAGGCANPVPALHSCASAPRGTGRVSALGPK 
TGSSPLSSPKG 



LGKYNTSMALFDFVLHNSTGEIRY1TEDDVIQ 
SQNALGKYNTSMALFESNSFEKTILESPYYVD 
LNQTLFVQVSLHTSDPNLWFLDTCRASPTSD 
FASPTYDLIKSGCSRDETCKNVYPLFGHYGRF 
QFNAFKFLRSMSSVYLQCKVLICDSSDHQSRC 
\NQGC VSRSKRD1 SSYKWKTDSI1GPIRLKRDR 
SA\N GNSGFQHETHAEETPNQPFNSVHLFSFM 
VLALNWTVATITVRHFVNQRADYQ\YQKLQ 
NY 



LGILMSPQVEAGEI*ALLTPPPGCMQFSPLTL/P 
K* WVSPGLTP/PPPEVPS VFLVEPGLPHAGQA 
GLDLLMSGDPPASTSQSARTTDVSHRAQPLAI 



- 



IGVLAFETGSCSVTRLYCIG1IMPHCSLDLAGS\ 
TSAFR1AGTTSVHHHPQLTFFFFWIETGSHCV 
VQTGL+LLALSNPPALASQ1AGISGMSHRAWP 
GLVLYSLEFSLLCASQSLIMLFTCYNE 



VKP VNGESKRD* GADTQTCEGEADEQLQTVN 

CYYD/STKSFFYISCG*K\RKPTWAENRRLNA 

KMFG1PLHSNSDPWGYEEREVIGFHRSRVSRG 

HGS 



QRNPFSAGHPQRPPTSGSQSELLAQPRLRPGR 

KSSFSRDQDVW* SQAVPKRQ*QRNPFSAGHP 

QRPPTSGSQSELLAQPRLRPGRKSSFSRDQDV 

WPGQKPRPSQQQHQMCASPTLGQRSPFALEP 

VPAYHGGRDPFASARPSPVG1PKPRAAPAGG 

GWRRIRPKSSTK 



PVIQRCSQPYGFSLLISFFLKCVSETSQQPPSR 

KVFQLLPSFPTLTRSKSHESQLGNRIDDVSSM 

RFDLSHGSPQMVRRDIGLSVTHRFSTKSWLS 

QVCHVCQKSMIFGVKCKHCRLKCHNKCTKE 

APACRIS FLPLTRLRRTBS VPSDINNP VDRAAE 

PHFGTLPKALTKXEHPPAMNHLDSSSNPSSTT 

FSTPSSPAPFPTSSNPSSATTPP\NPSP\GQR\DSR 

FNFPSC/A YFEHHR\Q\QFIFPDI S AFAHAAPLPE 

AADGTRLDDQPKADVLEAHEAEAEEPEAGK 

SEAEDDEDEVDDLPSSRRPWRGPISRKASQTS 

VYLQEWDIPFEQVELGEPIGQGRWGRVHRGR 

WHGEVAIRLLEMDGHNQDHLKJLFKKEVMN 

YRQTRHENVVLFMGACMNPPHLAIITSFCKG 

RTLHSFVRDPKTSLDTNKTRQ1AQE11KGMGY 

LHAKGIVHKDLKSRNVFYDNG\KVV1TDFGLF 

\GISGWP\EGRRENQLKLSHDWLCYLAPEIVR 

EMTPGKDEDQLPFSKAADVYAFGTVWY ELQ 

ARDWPLKNQAAEASIWQIGSGEGMKRVLTS 

VSLGKEVSENLSACWAFDLQERPS\FSLLMD 

MLEKJLPKLNRRLSHPGHF* KS ADIN SSK VVPR 

FERFGLGVLESSNPKM 



IPSPAWWNSTWADTFSLLLALAVALYLGYY 
WACVLQTHRAFCASNTEDLETWNHIKHRYP 
QAPLLAVGISFGGILVLNHLAQARQAAGLVA 
ALTLSACWDSFETTRSLETPLNSLLFNQPLTA 
GLCQLVERLS Y/E* DLQARTIRQFDER YTS VA 
Amino acid sequence (A=Alanine, C=Cysteine,
D=Aspartic Acid, E=Glutamic Acid,
F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine,
M=Methionine, N=Asparagine, P=Proline,
Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan,
Y=Tyrosine, X=Unknown, *=Stop codon,
/=possible nucleotide deletion, \=possible
nucleotide insertion 














FGYQDCVTYYKAASPRTKIDA1RIPVLYLSAA 
DDPFSTVCALPKQAAQHSPYVALLITARGGHI 
GFLEGLLPWQHWYMSRLLHQYAKAIFQDPE 
GLPDLRALLPSEDRNS 


466 


1816 


A 


3684 


3 


307 


SSQYIVQSKTKIFL* AAREKQ/RHTCRRFSIRLS 
ANISSQTGEARGQWPSVFKVLKEKKLSTKKS 
FGQK* GR\RKTFPDKQK/LREFDTTRPTIQEML 
TGVLQG 


467 


1817 


A 



3687 


2465 


837 


ELPTPLIAAHQLYNYVADHASSYHMKPLRMA 

RPGGPEHNEYALVSAWHSSGSYLDSEGLRHQ 

DDFDVSLLVCHCAAPFEEQGEAERHVLRLQF 

FVVLTSQRELFPRLTADMRRFRKPPRLPPEPE 

APGSSAGSPGEASGLILAPGPAPLFPPLAAEVG 

MARARLAQLVRLAGGHCRRDTLWKRLFLLE 

PPGPDRLRLGGRLALAELEELLEAVHAKSIGD 

IDPQLDCFLSMTVSWYQSLIKVLLSRFPQSCR 

HFQSPDLGTQYLVVLNQKFTDCFVLVFLDSH 

LGKTSLTVVFREPFPVQPQDSESPPAQLVSTY 

HHLESVINTACFTLWTRLL*GSGLDH*rvISLFL 

ESW AYQI ACQRQD* PALLGPRASQTLSDTKG 

F VTMS * GS AAPAWQQEPPSPNTHSH* PI QDSR 

ESGQPRGPLGPFWGTPFGPPGRVSGVHTGWQ 

TPPRAPLPESCPUPLTTVSHLCPLSLRVFTSHL 

DITAGHSHRDDTWVPIPALPLKHLRPPSSPFA 

LGPWVSHPLMRWVQKLSHLHSNPGTGFSMG 

GKQQRN 


468 


1818 


A 


3691 


960 


499 


QTCRKDKRAIYPHFQNE*MNEIKAI*SGTGGI 
QCFHSQNDSAFFFFLFLLETEFCSAA/TVQWH 
DFL SMQPPPPGFKQFTCLSLLSS WN YRR\PPPF 
PGNR*FLVKTGFPHVGQTGFELLTSSDLAPLA 
SQNGGITGMSPCAWPFFFFFFFGLC 


469 


1819 


A 


3714 


4747 


495 


MAYSWQTDFNPNESHEKQYEHQEFLFVNQP 

HSSSQVSLGFDQIVDE1SGKIPHYESEIDENTFF 

VPTAPKWDSTGHSLNEAHQISLNEFTSKSREL 

SWHQVSKAPAIGFSPSVLPKPQNTNKECSWG 

SPIGKHHGADDSRFSILAPSFTSLDKINLEKEL 

ENENHNYHIGFESSIPPTNSSFSSDFMPKEENK 

RSGHVNIVEPSLMLLKGSLQPGMWESTWQK 

NIESIGCSIQLVEVPQSSNTSLASFCNKVKKIR 

ERYHAADVNFNSGKIWSTTTAFPYQLFSKTK 

FNIHIFIDNSTQPLHFMPCANYLVKDLIAEILH 

FCTNDQLLPKDHILSVWGSEEFLQNDHCLGS 

HKMFQKDKSVIQLHLQKSREAPGKLSRKHEE 

DHSQFYLNQLLEFMH1WKVSRQCLLTLIRKY 

DFHLKYLLKTQENVYNIIEEVKKICSVLGCVE 

TKQITDAVNELSLILQRKGENFYQSSETSAKG 

LIEKVTTELSTSIYQLINVYCNSFYADFQPVNV 

PRCTSYLNPGLPSHLSFTVYAAHNIPETWVHR 

INFPLEIKSLPRESMLTVKLFGIACATNNANLL 

AWTCLPLFPKEKSILGSMLFSMTLQSEPPVEM 

ITPGVWDVSQPSPVTLQIDFPATGWEYMKPD 

SEENRSNLEEPLKECIKHIARLSQKQTPLLLSE 

EKXRYLWFYRFYCNNENCSLPLVLGSAPGW 

DERTVS EMHTILRRWTFSQPLEALGLLTSSFP 

DQELRKVAVQQLDNLLNDELLEYLPQLVQAV 

KFEWNLESPLVQELLHRSLQSIQVAHRLYAVL 

LKNAENEAYFKSWYQKLLAALQFCAGKALN 

DEFSKEQKLIKILGDIGERVfcSASDHQRQEVL 

KKEIGRLEEFFQDVNTCHLPLNPALCIKGIDH 

DACS YFTSNALPLKITFIN ANLMGKNI SI IFKA 
Amino acid sequence (A=Alanine, C=Cysteine,
D=Aspartic Acid, E=Glutamic Acid,
F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine,
M=Methionine, N=Asparagine, P=Proline,
Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan,
Y=Tyrosine, X=Unknown, *=Stop codon,
/=possible nucleotide deletion, \=possible
nucleotide insertion 






I 








gddlrqdmlvlqliqvmdniwlqegldmq 

miiyrclstgkdqrlvqmvpdavtlakihrh 

sgligplkentuckwfsqhnhlkadyekalr 

nffyscagwcvvtfilgvcdrhndnimltks 

ghmfhidfgkflghaqtfggikrdrapf1fts 

em\eyfitegg\knpqhfqdfv\elccrayniir 

khsqlll\nll^£mmlyag\lpelsgi\qdlky 

vy>jnlrpqdtoleatsh ftkkikeslecfp vk 

lnnlihtlaqmsaispakstsqtfpqescllst 

trsieratilgfskkssnlyliqvthsnnetsl 

teksfeqfsklhsqlqkqfasltlpefphww 

hlpftnsdhrrfrdlnhymeqilnvshev™ 

sdcvlsfflseagqqtveesspvylgekfpdk 

kpkvqlvisyedvkltilvkhmknihlpdgsa 

PSAHVEFYLLPYPSEVRRRKTKSVPKCTDPTY 
NE1VVYDEVTELQGHVLMLIVKSKTVFVGA1 
N1RLCSVPLDKEKWYPLGNSH*PLLLFSSFGM 
KSLEKDEFVGGMLLSNPIW 






A 


3718 

J / lu 


430 


75 


SHGSIS1LNLHQGCVFLPSLPAQGLRCYRCLA 
VLEGASCSVVSCPFLDGVCVSQKVSV/CWQ*/ 
CPWGARAEGRLSAWDSQ1SCCKGDLCNAV 
VLAAGSPWALCVQLLLSLGSVFLWALL 


471 


1821 


A 


3723 


891 


494 


LRQSL/NSVPQAGVQWRDSSLQAPPPRFTPLS 
CLSLPSSWDYRRLPPCLANFLYF**RRGFTML 
ARMVL1S* PRDPPASASQ\STEITGGSHRAQHP 
TDSRDHSERSVKKSHEV1SELRMKV1KCKVAF 
SKNPI 


472 


1822 


A 


3734 


443 


251 


GF1ET*MFCVSKDTSKKLS/RLPTKWKNVFAN 
* I SDKGL VSRICQELLRHLDAEQ VS ST AGL SL 


473 


1823 


A 


3746 


3 


500 


TH AS GG ARSG AG W AGRG VRAGTEAG RG GIF 

LTLS1LRTRDLPSGAMSEGVDL1DIYADEEFNQ 

DPEFNNTDQ1DLYDD VLTATSQPSDDRS SSTE 

PPPPVRQEPSPKPNNKTPAELYTYSGLRNRRA 

AVYVGSFSWWTTDQQLIQVIRSIGVYDVGEV 

KFAENRAK 


474 


1824 


A 


3753 


2 


5262 


RPLFAREGGIYAVLVCMQEYKTSV\LVQQAG 

LAALKMLAVASSSE1PTFVTGRDSIHSLFDAQ 

MTRE1FASIDSATRPGSESLLLTVPAAVILMLN 

TEGCSSAARNGLLLLNLLLCNHHTLGDQIITQ 

ELRDTLFRHSG1APRTEPMPTTRT1LMMLLNR 

YSEPPGSPVERAALETPUQGQDGSPELLIRSLV 

GGPSAELLLDLERVLCREGSPGGAVRPLLKRL 

QQETQPFLLLLRTLDAPGPNKTLLLSVLRV1T 

RLLDFPEAMVLPWHEVLEPCLNCLSGPSSDSE 

I VQELT CFLHRL A SMHKDYA WL CCLG AKE1 

LSKVLDKHSAQLLLGCELRDLVTECEKYAQL 

YSNLTSS1LAGCIQMVLGQIEDHRRTHQPINIP 

FFDVFLRHLCQGSSVEVKEDKCWEKVEVSSN 

PHRASKLTDHNPKTYWESNGSTGSHYITLHM 

HRGVLVRQLTLLVASEDSSYMPARVVVFGG 

DSTSCIGTELNTVNVMPSASRVILLENLNRFW 

PIIQIRIKRCQQGGIDTRVRGVEVLGPKPTFWP 

LFREQLCRRTCLFYTIRAQAWSRD1AEDHRRL 

LQLCPRLNRVLRHEQNFADRFLPDDEAAQAL 

GKTCWEALVSPLVQNITSPDAEGVSALGWLL 

DQYLEQRETSRNPLSRAASFASRVRRLCHLL 

VHVEPPPGPSPEPSTRPFSKNSKGRDRSPAPSP 

VLPSSSLRKrTQCWLSWQEQVSRFLAAAWR 

AFDFVPRYCKLYEHLQRAGSELFGPRAAFML 
Amino acid sequence (A=Alanine, C=Cysteine,
D=Aspartic Acid, E=Glutamic Acid,
F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine,
M=Methionine, N=Asparagine, P=Proline,
Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan,
Y=Tyrosine, X=Unknown, *=Stop codon,
/=possible nucleotide deletion, \=possible
nucleotide insertion 














ALRSGFSGALLQQSFLTAAHMSEQFARYIDQ 

QIQGGLIGGAPGVEMLGQLQRHLEPJMVLSG 

LELATTFEHFYQHYMADRLLSFGSSWLEGAV 

LEQIGLCFPNRLPQLMLQSLSTSEELQRQFHLF 

QLQRLDKLFLEQEDEEEKRL*EEEEEEEEEEA 

EKELFIEDPSPAJSILVLSPRCWPVSPLCYLYHP 

RKCLPTEFCDALDRFSSFYSQSQNHPVLDMG 

PHRRLQWTWLGRAELQFGKQILHVSTVQMW 

LLLKFNQTEEVSVETLLKDSDLSPELLLQALV 

PLTSGNGPLTLHEGQDFPHGGVLRLHEPGPQ 

RSGEALWLIPPQAYLNVEKDEGRTLEQKRNL 

LSCLLVRILICAHGEKGLHIDQLVCLVLEAWQ 

KGPNPPGTLGHTVAGGVACTSTD VL SCILHLL 

GQGYVKRRDDRPQILMYAAPEPMGPCRGQA 

DVPFCGSQSETSKPSPEAVATX A SLQLPAGRT 

MSPQEVEGLMKQTVRQVQETLNLEPDVAQH 

LLAHSHWGAEQLLQSYSEDPEPLLLAAGLCV 

HQAQAVPVRPDHCPVCVSPLGCDDDLPSLCC 

MHYCCKSCWNEYLTTRIEQNLVLNCTCPIAD 

CPAQPTGAFIRAIVSSPEVISKYEKALLRGYVE 

SCSNLTWCTNPQGCDRILCRQGLGCGTTCSK 

CGWASCFNCSFPEAHYPASCGHMSQWVDDG 

kj i I lajjvio Vl^V^oJxriljAJE^loKKUrot^QArlil 

KNEGCLHMTCAKCNHGFCWRCLKSWKFNH 

KDYYNCSAMVSKAARQEKRFQDYNERCTFH 

HQAREFAVNLRNRVSAIHEVPPPRSFTFLNDA 

CQGLEQARKVL AYACVYSFY SQDAEYMDV V 

EQQTENLELHTNALQJQLLEETLLRCRDLASSL 

RLLRADCLSTGMELLRRIQERLLAILQHSAQD 

FRVGLQSPSVEAWEAKGPNMPGSQPQASSGP 

EAEEEEEDDEDDVPEWQQDEFDEELDNDSFS 

YDESENLDQETFFFGDEEEDEDEAYD 


475 


1825 


A 


3754 


1093 


96 


GTSRNQHSPKTHA*RSS/WPQPPPLFLPPLQPQ 

ATGRRR R R TR TOOUT A A T I TTlCYTTYTfl A A \\! 

SRRPSLCWPSRTTGAPGAK+AVLVRSATPTTN 
PPNPQSPTGAAGKLRAPGNRAG/SEPSSQEPPP 
DGTR\RJPASITGVAQSPATRATPSLPCLHVPAP 
SRGOTLGVRTTGRASRLTVDRSRI SWPr T R9A 

RSGGGR WRPN APRGRWPRAP* SWEPGSWTE 

PWRWPFPAAESPPHRCIYCTNHVSPAGPARPS 
HVYQRATINSISHPLCRAOSSPWEAAGVWRR 

PAQPAPTSDVNINLLRKPRVKRHDLIYQFLGN 
TLWEEGRQRPPETLQPAR 


476 


1826 


A 


3758 


901 


521 


FFFGNGVSPCPQAGV* WHDLDSLQNLPPGFK 
RFSYLSLPSSW\DYRHVPPRQANFCff/M*RRG 
FTMLARMVSIS*PRDLPALASQSAGITGVSHH 
APPQMDFTFALLCFAPKGCLPRQKEGGTLNLI 


477 


1827 


A 


3761 


843 


575 


GVISAHCNLRL/CHLPGSSNSPASASQVAGTIG

ARTTPS*IFVFLVETGFHHVSQDGLDLL/NFVI 

RPRRPLKVLGLQACTRARLPSPLKEL 


478 


1828 


A 


3763 


267 


1240 


HLLSFHLWSASLDCLEQLSQERHVKGMLLGP 

PPVNESTKPSPSPWKLTPPMCSIPPVFPPKSGS 

PTTSWS/PSGHSKLEVERAQTGPFCLHIYCP*P 

GVTDNTTSLLHYIPFPRL\SGLVCFPAH*FPSY 

WTGHSFASQAWLRQVPEVSKHLQCPSAESLL 

TMEYHQPEDPAPGKAGTAEAV1PENHEVLAG 

PDEHPQDTDARD ADG E AREREP/RRPS F AA * P 

VWGQPVESPLPEASSAPPGPTLGTLPEVETIRA 
CSMPQELP* SPRTRQPEPDFYC VK WIPWKGE 
QTPUTQSTNGPLPSPCHHEHPLSSVEGEAPPA 
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Amino acid sequence (A=Alanine C=Cysteine, 

D=Aspartic Acid, E=Glutamic Acid,
F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine,
M=Methionine, N=Asparagine, P=Proline,
Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan, 
Y=Tyrosine, X=Unknown, *=Stop codon, 
/=possible nucleotide deletion, \=possible 
nucleotide insertion 














EGSDHIG 


479 


1829 


A 


3766 


2 


2152 


YSP1RLLEVCVPLPKIF1KJIQAPLKVSLLQDLK 

DFFQKVSQVYVA1DERLASLKTDTFSKTREEK 

MEDIFAQKEMEEGEFKNWIEKMQARLMSSS 

VDTPQQLQSVFESLIAKKQSLCEVLQAWNNR 

LQDLFQQEKGRKRPSVPPSPGRLRQGEESK1S 

AMDASPRNISPGLQNGEKEDRFLTTLSSQSST 

SSTHLQLPTPPEVMSEQSVGGPPELDTASSSE 

DVFDGHLLGSTDSQVKEKSTMKAIFANLLPG 

NSYNPIPFPFDPDKHYLMYEHERVPIAVCEKE 

PSSIIAFALSCKEYRNALEELSKATQWNSAEE 

GLPl'NSTSDSRPKSSSPIRLPEMSGGQTNRTTE 

TEPQPTKKA SGMLSFFRGTAGKSPDLS SQKRE 

TLRGADSAYYQVGQTGKEGTENQGVEPQDE 

VDGGDTQKKQLINPHVELQFSDANAKFYCRL 

ARGGKSGAAFYATEDDRFILKQMPRLEVQSF 
LDFAPHYFNYIINAVQQKRPTALAK1LGVYRI 

VFDLKGSLRNRNVKTDTGKESCDWLLDENL 
LKMVRDNPLYIRSHSKAVLRTSIHSDSHFLSS 
HLIID Y SLLVGRDDTSNEL V VGIID YIRTFTWD 
w\ PA/i\/vTcr<!Tr;ii nnnn*TUPTVVSPFT yrtr 

NJVLCiVJ V V rvo l vj iiajvj V^vj lvlt i v v ljk, i x\. i ix. 

FCEAMDNYFLMVPDHCTGLGLNC 


480 


1830 


A 


3777 


251 


3 


QGCGSAGTLIHY**ECKMVQLLWKTV*QFLI 
KLNI\KDPAITLDVYPNEVKNYVRTKTYTQMF 

1 / A XTFTIv/f A K <1 OPTHP S VRT 


481 


1831 


A 


3779 


333 



3 


EAAIRQPEPNILDVNQIFKDLAMIIHDQGDLID 
SIE ANAES SEVL VERAPGQLQRP A\ YYQKKSR 
KKMCLVVLVQTAIILICERIM*WYTTKWSPPI 

VLPVSCFQGQKFN 


482 


1832 


A 


3780 


2 


371 


PNQDMKSSSNSLIIRKVQIKPTILYHH1FTRKA 
KMKTTDKTKYR*GFKAITTLIHCSQDCKXQ*S 
ft* ENHFM 1FPKAEQHIT YDTTIPFLR 


483 


1833 


A 


3787 


43 


448 


LMKDLSPYVMETHYILNRLNER/RSMWRHIIG 
KLPNTKDQEKILKA1RGRRE VIQG S/RQQYRR 
PAAFSAAEKARRLWCS/VFN1ERRNL/CEYPTK 
LSFNIKGEMTFSDKTEFTTN1U>SLKMXKI)RI 
nPFHTf MF*KFKrFKRKE 


484 


1834 


A 


3798 


1 


727 


FFFFETESRSVAQAGVQWCNLGSLQALPPGB 
SHSPASASRVAGTTGTRH*ARLIFYTFSRDGVS 
pr*pnws*<;pr)i virpp\ri pkcwdyRREPPRP 
A*FFVFLVE\QGFTML ARMVSIS* PQ/CDLPAS 
VSQNAGITGVSHCAWPCLHFCFFGFFFEMESC 
WAOARVOWHDLRSLOAPPPGFTPFSCLSLPG 
SWDYRRPPPRPANF\CIFSRDGVSPC*PGWSRS 
PDLV1RPPRPPKVLGLQA 


485 


1835 


A 


3802 


1 


239 


FFFFEMECLTVSQAGVQWYNLHSLQPLPPGF 

KQFSC\LSLPSSWD*RVPTSRPAKF/CVIF*DGV 

SHCQPGWSAVVQPPLH 


486 


1836 


A 


3811 


378 


98 


RYD* SSQSENIP\QKEFLLKYP* CTATLGMRN 

MSIMKKKS1FSAEFYKVSLPSLLLVHLLAIEWG 

FHIEIQLTIHQHFLNYELESDFVHIVEYM 


487 


1837 


A 


3814 


771 


320 


FDPDWTRAAG1RHEKJCPKALAYRRENSPGDL 

PPPPLPPPEEEASWAL/GAEGSRQHVLPGAGA 

QWGEESGPGRAPGSPAGAPPR*RGLAPVNSRP 

SFLSRGQGTSTCSTAGSNSSRGSSSSRGSRGPG 

RSRSRSQSRSQSQRPGQKRREEPR 
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sequence 
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488 


1838 


A 


3818 


1 


781 


FRACLLELIPYAPTLSWTACPPAMAGPRGLLP 

LCLLAFCLAGFSFVRGQVLFKGCDVKTTFVT 

HVPCTSCAAIKKQTCPSGWLRJELPDQITQDCR 

YEVQLGGSMVSMSGCRRKCRKQVVQKACCP 

GYWGSRCHECPGGAETPCNGHGTCLDGMDR 

NGTCVCQENFRGSACQECQDPNRFGPDCQSV 

CSCVHGVCNHGPRGDGSCLCFAGYTGPHCD 

QELPVWQELGFPQNNPRLRKAPNCKCLPG*H 

RNGLIATPNPCRP 


489 


1839 


A 


3822 


934 


669 


FFFSEMESRSVTRLECSGAISAHLRLLGSSNSP 
ASAS* V AGTIG ACHHAQLIFVFL VETGFHHVG 
QDGLDLUNLMIHPPRPPKVLGFQA 


490 


1840 


A 


3825 


79 


9748 


GCQSCWPAWPRLRRRGPASAGARLGRKAPW 

GLPGRVQDGRPLRFCFYLRPRAPFIAPVLSGA 

ASRPEASGDCRAGRETAMATLEKLMKAFESL 

KSFQQQQQQQQQQQQQQQQQQQQQQQPPPP 

PPPPPPPQLPQPPPQAQPLLPQPQPPPPPPPPPP 

GPA VAEEPLHRPKKELSATKJCDRVNHCLTI C 

ENIVAQSVRNSPEFQKLLG1AMELFLLCSDDA 

ESDVRMVADECLNKV1KALMDSNLPRJLQLEL 

YKEIKKNGAPRSLRAALWRFAELAHLVRPQK 

CRPYLVNLLPCLTRTSKRPEESVQETLAAAVP 

KIMASFGNFANDNEIKVLLKAFIANLKSSSPTI 

RRTAAGSAVSICQHSRRTQYFYSWLLNVLLG 

LLVPVEDEHSTLLILG VLLTLRYL VPLLQQQ V 

KDTSLKGSFGVTRKEMEVSPSAEQLVQVYEL 

TLHHTQHQDHNWTGALELLQQLFRTPPPEL 

LQTLTAVGGIGQLTAAKEESGGRSRSGSIVELI 

AGGGSSCSPVLSRKQKGKVLLGEEEALEDDS 

ESRSDVSSSALTASVKDEISGELAASSGVSTPG 

SAGHDIITEQPRSQHTLQADSVDLASCDLTSS 

ATDGDEEDILSHSSSQVSAVPSDPAMDLNDG 

TQASSPISDSSQl"! 1EGPDSAVTPSDSSEIVLD 

GTDNQYLGLQIGQPQDEDEEATGILPDEA SEA 

FRNSSMALQQAHLLKNMSHCRQPSDSSVDKF 

VLRDEATEPGDQENKPCRIKGDIGQSTDDDS 

APLVHCVRLLSASFLLTGGKNVLVPDRBVRV 

SVKALALSCVGAAVALHPESFFSKLYKVPLD 

TTEYPEEQYVSD1LNYIDHGDPQVRGATAILC 

GTLICSILSRSRFHVGDWMGTIRTLTGNTFSL 

ADCIPLLRKTLKDES S VTCKL ACTAVRNC VM 

SLCSSSYSELGLQL11DVLTLRNSSYWLVRTEL 

LETLAEIDFRLVSFLEAKAENLHRGAHHYTGL 

LKLQERVLNNVVIHLLGDEDPRVRHVAAASL 

IRL VPKLFYKCDQGQ ADPWA V ARDQS S V YL 

KLLMHETQPPSHFSVSTITRIYRGYNLLPSITD 

VTMENNLSRV1AAVSHELITSTTRALTFGCCE 

ALCLLSTAFPVCIWSLGWHCGVPPLSASDESR 

KSCTVGMATMILTLL SSA WFPLDLSAHQDAL 

ELAGNLLAASAPKSLRSSWASEEEANPAATK 

QEEVWPALGDRALVPMVEQLFSHLLKVIN1C 

AHVLDDVAPGPAIKAALPSLTNPPSLSPIRRK 

GKEKEPGEQASVPLSPKJCGSEASAASRQSDTS 

GPVTTSKSSSLGSFYHLPSYLKLHDVLKATHA 

NYKVTLDLQNSTEKFGGFLRSALDVLSQ1LEL 

ATLQDIGKCVEEILGYLKSCFSREPMMATVC 

VQQLLKTLFGTNLASQFDGLSSNPSKSQGRA 

QRLG SS S VRPGL YH YCFMAP YTHFTQ AL ADA 

SLRNMVQAEQENDTSGWFDVLQKVSTQLKT 

NLTSVTKNRADKNAIHNHIRLFEPLVDCALKQ 
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YTTTTCVQLQKQVLDLLAQLVQLRVNYCLL 

DSDQVFIGFVLKQFEYIEVGQFRESEAIIPNIFF 

FLVLLSYERYHSKQIIGIPKIIQLCDGIMASGR 

KAVTHAIPALQPIVHDLFVLRGTNKADAGKE 

LETQKEVWSMLLRLIQYHQVLEMFILVLQQ 

CHK^NEDKWKRLSRQIADIBLPMLAKQQMHI 

DSHEALGVLNTLFEILAPSSLRPVDMLLRSMF 

VTPNTMASVSTVQLWISGILAILRVLISQSTED 

IVLSRIQELSFSPYLISCTVINRLRDGDSTSTLE 

EHSEGKQKNLPEETFSRFLLQLVGILLEDIVT 

KQLKVEMSEQQHTFYCQELGTLLMCLIHIFKS 

GMFRRITAAATRLFRSDGCGGSFYTLDSLNLR 

ARSMITTHPALVLLWCQILLLVNHTDYRWW 

AEVQQTPKRHSLSSTKLLSPQMSGEEEDSDLA 

AKLGMCNREIVRRGALILFCDYVCQNLHDSE 

HLTWLIVNHIQDLISLSHEPPVQDFISAVHKNS 

AASGLFIQAIQSRCENLSTPTMLKKTLQCLEGI 

HLSQSGAVLTLY VDRLLCTPFRVLARMVDI L 

ACRRVEMLLAANLQSSMAQLPMEELNRIQEY 

LQSSGLAQRHQRLYSLLDRFRLSTMQDSLSPS 

PPVSSHPLDGDGHVSLETVSPDKDWYVHLVK 

SQCWTRSDSALLEGAELVNRJPAEDMNAFM 

MNSEFNLSLLAPCLSLGMSEISGGQKSALFEA 

AREVTLARVSGTVQQLPAVHHVFQPELPAEP 

AAYWSKLNDLFGDAALYQSLPTLARALAQY 

LVWSKLPSHLHLPPEKEKDIVKFVVATLEAL 

SWHLIHEQIPLSLDLQAGLDCCCLALQLPGL 

WSVVSSTEFVTHACSLIYCVHFILEAVAVQPG 

EQLLSPERRTNTPKAISEEEEE VDPN TQNPK YI 

TAACEMVAEMVESLQSVLALGHKRNSGVPA 

FLTPLLRNniSLARLPLVNSYTRVPPLVWKLG 

WSPKPGGDFGTAFPEIPVEFLQEKEVFKEFIYR 

INTLGWTSRTQFEETWATLLGVLVTQPLVME 

QEESPPEEDTERTQINVLAVQAITSLVLSAMT 

VPVAGNPAVSCLEQQPRNKPLKALDTRFGRK 

LSIIRGIVEQEIQAMVSKREN1ATHHLYQAWD 

PVPSLSPATTGALISHEKLLLQINPERELGSMS 

YKLGQVSIHSVWLGNSITPLREEEWDEEEEEE 

ADAPAPSSPPTSPVNSRKHRAGVDIHSCSQFL 

LELYSRWILPSSSARRTPAILISEVVRSLLWS 

DLFTERNQFELMYVTLTELRRVHPSEDEILAQ 

YLVPATCKAAAVLGMDKAVAEPVSRLLESTL 

RSSHLPSRVGALHGVLYVLECDLLDDTAKQL 

IPVISDYLLSNLKGIAHCVNIHSQQHVLVMCA 

TAFYLIENYPLDVGPEFSASIIQMCGVMLSGS 

EE Sir SHY HC AL KOLbKLLL dbQLoKLD Aco L 

VKLSVDRVNVHSPHRAMAALGLMLTCMYT 

vjKJbK.Vbr\jfKl ol^rNrAArJJotioVl v>yivijqkvo 

VLFDRIRKGFPCEARVVARJLPQFLDDFFPPQ 

DIMNKVIGEFLSNQQPYPQFMATVVYKVFQT 

LHSTGQSSMVRDWVMLSLSNFTQRAPVAMA 

TWSI ^CFFVSASTSPWVAAILPHVISRMGKLE 

QVDVNLFCLVATDFYRHQIEEELDRRAFQSV 

LEWAAPGSPYHRLLTCLRNVHKVTTC 


491 


1841 


A 


3826 


469 


302 


SNPPASASRVAGITG V HQHA WLIFVFL VEMEF 
HHVGQAVLKLLISGDLPVSASQSA 


492 


1842 


A 


3836 


392 


88 


VAPSPMIMPDLYFYRDPEEIEKEE+AAAEKVEE 
FQSEWTAW/P/EFTATQSEVADWFKDMQVP 
SVPIQQFPTEDWST*PTMNDWSATSTAQTTE 
WVRITTEWP 
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493 


1843 


A 


3838 


19 


380 


TPSDMNRAFETDTQSIGEKNRSPSEPDYFERK 
KFKRS*EKAHJRYKIDQPEDIPLK\EFLCKHSK 
CTATLSMRNMSLMKKKCSFSEEF\LAFFPSLL 
VCHLLAIKLGFYEEIHLTTFNNTF 


494 


1844 


A 


3845 


2 


352 


FFFLRRSL/DSVAQAEAQWIAELGLLQAPPPGF 
KPISLPVGLPSSWDYGRPPPCPANFCIF/M*RRG 
FTVLARMVLIS*PCDPPTLASQGTAITGMSYH 
ARPQDIDFLYAHQGRCWFRLL 


495 


1845 


A 


3847 


1774 


40 


DIFFRRAKEGMGQDEAQFSVEMPLTGKAYL 

WADKYRPRKFRFFNRVHTGFEWNKYNQTHY 

DFDNPPPKIVQGYKFNIFYPDLIDKRSTPEYFL 

EACAJDNKDFAILRFHAGPPYEDIAFKIVNREW 

EYSHRHGFRCQFANGIFQLWFHFKRYRYRR* 

RPWGTAGRCPRGHSKGASVKLWTPGPLSGL 

QGRGFTSHLRPHL SFARPQFPPPKGGHH* AC 

HGELRRHWDRLA* GPDATEG ALGASFEHEG 

GQQPPADLTVQADTLHRPSARLGGAHRACPK 

RRPHRVLWRWARGAWAWRCOAREKOETOO 

QPCHITGHPLGREAEPAAAGAAPALAHRPPF 

ARTG STEVPGPC WRPIRHCRRDPL WTPTL C\RD 

WPPTHPVLAGGVHFPAAG/IGGCVEVPVSVN 

VMGTKSH* AVLPPPPSTGPGGQGLPEG WGLE 

KGEGLPPGIPPPGLLTGPW\SMRPVTPSFAHIR 

TVAPSHSPFSGQEGRGPHGCHSPGR\SGPVAGR 

LVLQHPTGTSPTEAKRKVPPGPPEGHPTSPVT 

SPRPPTAPPRHPASSGNSSVCFSKKTCRWEKK 

SFVLMELAYWQDRMFF 


496 


1846 


A 


3849 


830 


442 


AKSPLPLG*IQWR/NLGSLKLRLPGFK*FTCLG 
LLSSWDYRSLPPRPVNFCILVELGFHHVDQAG 
LKLLTSSALPALASQSAEITGMSHRIWPLPLLR 
RPPVIRJRAPPQRLPFNLITSLKALSPNMATF 


497 


1847 


A 


3859 


2 


393 


ALRKTRRDGIARTGAQPAASWKGTNNYPWR 

LEMAGRPGSQEQSKDRGTGSLPPPSQRPLGPS 

PEGAGPSPPPPGIPRGGGSSSSEGP/PQLLFVPR 

RFPAPKKGLPSDTPHSKAPPTPHLILGGEDSQ 
VPIL 


498 


1848 


A 


3860 


253 


634 


KNASTVYSSQGDPKSFFFLLRWSLALVAQAG 
EQ*RDLSSLQPPPPGFK*FSCLSLPSSWD\YRCP 
LPCLANF\*FLVETGFHHVGQADLKLLTSGDP 
PTSASESAG1TGVSHRAWPRIF1FLYWKTFFL 


499 


1849 


A 


3863 


423 


263 


APSQI S VAFL Y AA/DKLFEKEI* KKIPFII AS/DKI
KIGIWLTKEVKYLYTENYITLMKEIK7DTDKW 
KX>ILY*WIGKINI*KMSTPPKAIYRFNA1PTKIP 
MTFFTEIEKSIIKFI WNHKKPPNTQSNIEQKE* S 

FCSILLWWGGFLWFHMNFMIDFSISVKNVIGI 
LVGIALNL 


500 


1850 


A 


3865 


2 


15246 


LPRGCLWCLQRSPTPARPQPSRPARSPLPLFP

DLRPWASDLDIMGDAEGEDEVQFLRTDDEV 

VLQCSATVLKEQLKXCLAAEGFGNRLCFLEP 

TSNAQNVPPDLA1CCFVLEQSLSVRALQEML 

ANTVEAGVESSQGGGHRTLLYGHAILLRHAH 

SRMYLSCLTTSRSMTDKLAFDVGLQEDATGE 

ACWWTMHPASKQRSEGEKVRVGDDIILVSVS 

SERYLHLSTASGELQVDASFMQTLWNMNPIC 

SRCEEGFVTGGHVLRLFHGHMDECLTISPADS 

DDQRRL VYYEGGAVCTHARSLWRLEPLRIS 

WSGSHLRWGQPLRVRHVTTGQYLALTEDQG 

LWVDA SKAHTKATSFCFRISKEKLD VAPKR 

DVEGMGPPEIKYGESLCFVQHVASGLWLTYA 
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APDPKAIJILGVLKKKAMLHQEGHMDDALSL 

TRCQQEESQAARM1HSTNGLYNQFIKSLDSFS 

GKPRGSGPPAGTALPIEGVILSLQDLIIYFEPPS 

EDLQHEEKQSKLRSLRNRQSLFQEEGMLSMV 

LNCIDRLNVYTTAAHFAEFAGEEAAESWKEI 

VNLLYELLASL1RGNRSNCALFSTNLDWLVS 

KLDRLEASSGILEVLYCVLIESPEVLNIIQENHI 

KS1ISLLDKHGRNHKVLDVLCSLCVCNGVAV 

RSNQDLITENLLPGRELLLQTNLINYVTSIRPN 

IFVGRAEGTTQYSKWYFEVMVDEVTPFLTAQ 

ATHLRVGWALTEGYTPYPGAGEGWGGNGV 

GDDLYSYGFDGLHLWTGHVARPVTSPGQHL 

LAPEDV1SCCLDLSVPS1SFRINGCPVQGVFESF 

NLDGLFFPVVSF SAG VKVRFLLGGRHGEFKF 

LPPPGYAPCHEAVLPRERLHLEPIKEYRREGP 

RGPHLVGPSRCLSHTDFVPCPVDTVQ1VLPPH 

LERIREKLAENIHELWALTRIEQGWTYGPVRD 

DNKRLHPCLVDFHSLPEPERNYNLQMSGETL 

KTLLALGCHVGMADEKAEDNLKKTKLPKTY 

MMSNGYKPAPLDLSHVRLTPAQTTLVDRLAE 

NGHNVWARDRVGQGWSYSAVQD1PARRNPR 

LVPYRLLDEATKRSNRDSLCQAVRTLLGYGY 

NIEPPDQEPSQVENQSRCDRVRJFRAEKSYTV 

QSGRWYFEFEAVTTGEMRV G WARPELRPD V 

ELGADELAYVFNGHRGQRWHLGSEPFGRPW 

QPGDVVGCM1DLTENTIIFTLNGEVLMSDSGS 

ETAFREIEIGDGFLPVCSLGPGQVGHLNLGQD 

VSSLRFFAICGLQEGFEPFAINMQRPVTTWFS 

KGLPQFEPVPLEHPHYEVSRVDGTVDTPPCLR 

LTHRTWGSQNSLVEMLFLRLSLPVQFHQHFR 

CTAGATPLAPPGLQPPAEDEARAAEPDPDYE 

NLRRSAGGWSEAENGKEGTAKEGAPGGTPQ 

AGGEAQPARAENEKDATTEKNKKRGFLFKA 

KKVAMMTQPPATPTLPRLPHDWPADNRJDD 

PEIILNTTTYYYSVRVFAGQEPSCVWAGWVT 

PDYHQHDMSFDLSKVRVVTVTMGDEQGNV 

HSSLKCSNCYMVWGGDFVSPGQQGRJSHTDL 

VIGCLVDLATGLMTFTANGKJESNTFFQVEPN 

TKLFPAVFVLPTHQNVIQFELGKQKNIMPLSA 

AMFQSERKNPAPQCPPRLEMQMLMPVSWSR 

MPNHFLQVETRRAGERLGWAVQCQEPLTMM 

ALH1PEENRCMDILELSERLDLQRFHSHTLRL 

YRAVCALGNNRVAHALCSHVDQAQLLHALE 

DAHLPGPLRAGYYDLLISIHLESACRSRRSML 

SEY1VPLTPETRA1TLFPPGRSTENGHPRHGLP 

GVGVTTSLRPPHHFSPPCFVAALPAAGAAEAP 

ARLSPAIPLEALRDKALRMLGEAVRDGGQHA 

RDPVGASVEFQFVPVLKLVSTLLVMGIFGDE 

DVKQILKM1EPEVFTEEEEEEDEEEEGEEEDEE 

EKEEDEEETAQEKEDEEKEEEEAAEGEKEEG 

LEEGLLQMKJLPESVKLQMCHLLEYFCDQELQ 

HRVESLAAFAERYVDKLQANQRSRYGLLIKA 

FSMTAAETARRTREFRSPPQEQINMLLQFKDG 

TDEEDCPLPEEIRQDLLDFHQDLLAHCGIQLD 

GEEEEPEEETTLGSRLMSLLEKVRLVKKKEEK 

PEEERSAEESKPRSLQELVSHMVVRWAQEDF 

VQSPELVRAMFSLLHRQYDGLGELLRALPRA 

YTISPSSVEDTMSLLECLGQIRSLLIVQMGPQE 

ENLM1QSIGNININNKVFYQHPNLMRALGMIIE 

TVMEVMVNVLGGGESKEIRFPKMVTSCCRFL 
CYFCRISRQNQRSMFDHLSYLLENSGIGLGM 

QGSTPLDVAAASVIDNNELALALQEQDLEKV 

VSYLAGCGLQSCPMLVAKGYPDIGWKPCGG 

ERYLDFLRFAVFVNGESVEENANWVRLLIR 

KPECFGPALRGEGGSGLLAAIEEAIRJSEDPAR 

DGPGIRRDRRREHFGEEPPEENRVHLGHAIMS 

FYAALIDLLGRCAPEMHLIQAGKGEALRIRAI 

LRSLVPLEDLVGHSLPLQIPTLGKDGALVQPK 

MS ASFVPDHKASM VLFLDRV Y GIENQDFLLH 

VLDVGFLPDMRAAASLDTATFSTTEMALAV 

NRYLCLAVLPLITKCAPLFAGTEHRA1MVDS 

MLHTVYRLSRGRSLTKAQRDVIEDCLMSLCR 

YIRPSMLQHLLRRLVFDVP1LNEFAKMPLKLL 

TNHYERCUTCYYCLPTGWANFGVTSEEELHL 

TRKLFWGIFDSLAHKKYDPELYRMAMPCLC 

AIAGALPPDYVDAS Y SSKAEKKATVDAEGNF 

DPRPVETLNVIIPEKLDSFINKFAEYTHEICWAF 

DKIQNNWSYGENIDEELKTHPMLRPYKTFSE 

KDKEIYRWPIKESLKAM1AWEWTIEKAREGE 

EEKTEKKKTAKISQSAQTYDPREGYNPQPPDL 

SAVTLSRELQAMAEQLAENYHNTWGRKKKQ 

ELEAKGGGTHPLLVPYDTLTAKEKARDREKA 

QELLKFLQMNGYAVTRGLKDMELDSSSIEKR 

FAFGFLQQLLRWMDISQEFIAHLEAWSSGRV 

EKSPHEQEIKFFAKILLPLINQYFTNHCLYFLS 

TPAKVLGSGGHASNKEKEMITSLFCKXAALV 

RHRVSLFGTDAPAVVNCLHILARSLDARTVM 

KSGPEIVKAGLRSFFE SASEDIEKMVENLRLG 

KVSQARTQVKGVGQNLTYTTVALLPVLTTLF 

QHIAQHQFGDDVILDDVQVSCYRTLCSIYSLG 

TTKNTYVEKLRPALGECLARLAAAMPVAFLE 

PQLNEYNACSVYTTKSPRERAILGLPNSVEEM 

CPDIPVLERLMADIGGLAESGARYTEMPHVIE 

ITLPMLCSYLPRWWERGPEAPPSALPAGAPPP 

CTAVTSDHLNSLLGNILRTIVNNLGIDEASWM 

KRLAVFAQPIVSRARPELLQSHFIPTIGRLRKR 

AGKWSEEEQLALEAKAEAQEGELLVRDEFS 

VLCRDLYALYPLLIRYVDNNRAQWLTEPNPS 

AEELFRMVGEIFIYWSKSHNFKREEQNFWQ 

NEINNMSFLTADNKSKMAKAGDIQSGGSDQE 

RTKKKRRGDRYSVQTSLIVATLBCKMLPIGLN 

MCAPTDQDLnXAKTRYALKDTDEEVREFLH 

NNLHLQGKVEGSPSLRWQMALYRGVPGREE 

DADDPEKTVRRVQEV S AVLYYLDQTEHPYKS 

KKAVWHKLLSKQRRRAWACFRMTPLYNLP 

THRACNMFLESY KAA WILTEDHSFEDRMIDD 

LSKAGEQEEEEEEVEEKKPDPLHQLVLHFSRT 

ALTEKSKLDEDYLYMAYADIMAKSCHLEEG 

GENGEAEEEVEVSFEEKQMEKQRLLYQQARL 

HTRGAAEMVLQMISACKGETGAMVSSTLKL 

GISILNGGNAEVQQKMLDYLKDKKEVGFFQS 

IQALMQTCSVLDLNAFERQNKAEGLGMVNE 

DGTVINRQN GEKVMADDEFTQDLFRFLQLLC 

EGHNNDFQNYLRTQTGNri'l INinCTVDYLL 

RLQESISDFYWYYSGKDVIEEQGKRNFSKAM 

SVAKQVFNSLTEYIQGPCTGNQQSLAHSRLW 

DAVVGFLHVFAHMMMKLAQDSSQIELLKEL 

LDLQKDMVYMLLSLLEGNVVNGMIARQMV 

DMLVESSSNVEMILKFFDMFLKLKDIVGSEAF 

QDYVTDPRGLISKKDFQKAMDSQKQFSGPEI 

QFLLSCSEADENEMINCEEFANRFQEPARD1G 

FNVAVLLTNLSEHVPHDPRLHNFLELAESILE 

YFRPYLGRJEIMGASRRIERIYFEISETNRAQW 

EMPQVKESKRQFIFDVVNEGGEAEKMELFVS 

FCEDTIFEMQIAAQISEPEGEPETDEDEGAGA 

AEAGAEGAEEGAAGLEGTAATAAAGATARV 

V AAAG RALRGLS YRSLRRRVRRLRRLTAREA 

ATAVAALLWAAVTRAGAAGAGAAAGALGL 

LWGSLFGGGLVEGAKKVTVTELLAGMPDPT 

SDEVHGEQPAGPGGDADGEGASEGAGDAAE 

GAGDEEEAVHEAGPGGADGAVAVTDGGPFR 

PEGAGGLGDMGDTTPAEPPTPEGSPILKRKLG 

VDGVEEELPPEPEPEPEPELEPEKADAENGEK 

EEVPEPTPEPPKKQAPPSPPPKKEEAGGEFWG 

ELEVQRVKFLNYLSRNFYTLRFLALFLAFAIN 

FILLFYKVSDSPPGEDDMEGSAAGDVSGAGS 

GGSSGWGLGAGEEAEGDEDENMVYYFLEES 

TGYMEPALRCLSLLHTLVAFLCIIGYNCLKVP 

LViFKREKELARKLEFDGLYITEQPEDDDVKG 

QWDRLVLNTPSFPSNYWDKFVKRKVLDKHG 

DIYGRERIAELLGMDLATLEITAHNERKPNPP 

PGLLTWOnSIDVKYQIWKFGVlr I UNbrL YLu 

WYMVMSLLGHYNNFFFAAHLLDIAMGVKTL 

RTILSSVTHNGKQLVMTVGLLAVVVYLYTVV 

AFNFFRKr YNKbhDbD-kr DMivl^-UL/MiVi lv^YL 

FHMYVGVRAGGGIGDEIEDPAGDEYELYRW 

FDITFFFFVIVlLLAnQGLIlDAFGELRDQQEQV 

KEDMETKCFICGIGSDYFDTTPHGFETHTLEE 

HJNLAJN Y IVLr r .LIV1 i 1^1 IN ivL/c, I r,rl 1 vjv,/e,o i v wiv 

MYQERCW DFFPAGDCFRKQYEDQLS 


501 


1851 


A 


3869 


467 


665 


VIVAJYCQLIFDKGAKTIQ*PFQQIAL/CKRMK 
LGPCFTPCGKJNSEWIRELSVRVKTIKHLEIGV 

N 


502 


1852 


A 


3888 


1042 


724 


SGMQWRDLTPLQPLPPRFKQFSCL SLPGS WD 
YRHAPYPL LTNF\* FL V EMGFCY VGQ AGRKJLL 
AboDQoAL.Ai*>v^oA*jl 1 ulo I ArUrrrrrLiNrE.A 
GSCSVAQAGVQ 


503 


1853 


A 


3891 


1773 


1193 


EVDSQSGVQ*QAPGSLQLQTPGLK/VSCLLSR 

atwdoci nrji A orr^VWWVA/'FT +RR f^T TTI 
QDYKosLrrlLA5tt i Y T i Y i / vrL T luvut I il. 

VQGGLKLLPSSNPF AS AP* TAGITGMSHC AGP 

HFNF*MFRKISCIRE*F*HTRIYDIPFLILFFKET 

WVLLCYPGWPQIPGLKPSSCLRLL5SWDHRC 

APPCPASFFIFHVDRVSPPCPGLVS1TFKMLLL 

L 


504 


1854 


B 


3896 


279 


70 


M VS KSKSILMSYNH V ELTFSDMKKMPE AFRR 
TQKHTIYLIPYQV1FWSTGKDAMRSFMMPFY 

QKEYYENQ* 


505 


1855 


A 


3899 


2 


1396 


EPGVPTKKTWFDKPDFNRTNSPGFQKKVQFG 

NENTKLELRKVPPELNNISKLNEHFSRFGTLV 

NLQV AYNGDP EGALIQFATYEEAKKAI SSTEA 

VT NNRFIKVYWHREGSTOOLOTTSPKVMOPL 

VQQPILPWKQSVKERLGPVPSSTIEPAEAQS 

ASSDLPQVLST\LLA*QKQCHQLL/WKAAQKT 

LLVSTSAVDNNEAQKKKQEALKLQQDVRKR 

KQEILEKHIETQKMLISKLEKNKTMKSEDKAE 

1MKTLEVLTKNITKLKDEVKAASPGRCLPKSI 

KTKTQMQKELLDTELDLYKKMQAGEEVTEL 

RRKYTELQLEAAKRGILSSGRGRG1HSRGRGA 

VHGRGRGRGRGRGVPGHAVVDHRPRALEIS 
AFTESDREDLLPHFAQYGEIEDCQIDDSSLHA 
VITFKTRAEAEAAAVHGARFKGQDLKLAWN 
KPVTNISAVETEEVEPDEEEQREIIIA 


506 


1856 


A 


3911 


1952 


919 


DAELSGTLSLVLTQCCKRIKDTVQKLASDHK 
DIHSSVSRVGKAIDKNFDSDISSVGIDGCWQA 

DSQRLLNEVMVEHFFRQGMLDVAEELCQES 

GLSVDPSQKEPFVELNR1LEALKVRVLRPALE 

WAVSNREMLIAQNSSLEFKLHRLYFISLLMG 

GTTNQREALQYAKNFQPFALNHQKDIQVLM 

GSLVYLRQGIENSPYVHLLDANQWADICDIFT 

RDACALLGLSVESPLSVSFSAGCVALPALIN1K 

AVIEQRQCTGVWNQKDELPIEV\DLG*KSAGY 

HSIFACPILRQQTTDNNPPMKLVCGHNSRDAL 

NKMFNG SKLKCPYCPMEQSPGD AKQIFF 


507 


1857 


A 


3936 


439 


18 


SHPFSPAPGICPDAPPPLPRPSKGLGHPGTAGA 
PGSGARCHPPSTCSPSWASPG*GAKASPALPR 
SHGVTLLCKAQAHLCRGEDSKDASGSTSQA 
WEPG* G A WGMPRCQGPALGS CFCPPGTTVQ 
RPAKQRDKRNRHLGR 


508 


1858 


A 


3944 


120 


412 


WCPAGTLDFPGPQEMVLLEIEVMNQLNHRNL 
IQLYAAIETPHEJVLFME\YECPK* W* GLGGGT 
TRHGASR^GGVCAHSIEGGELFERIVDEDYHLT 
EV 


509 


1859 


A 


3949 


31 


392 


LTKTPSPREK.GRGVLSVLLMMI* KCRVIFVK1P 
MVFFLQNFC/RJILNVA\WTGD*PNTL*KEQRG 
ITFSDSKS+ YKATKIKTMW YCHKNRYID/ERN 
RIEIPEINPCICDKHFRKLSMTTQ 


510 


1860 


A 


3954 


1013 


885 


FSETK^CCPRLEHSGRIEAHCSLNIPGSSDPPT 
SASSVAATTG 


511 


1861 


A 


3956 


1 


1054 


PPAWAPRSPLIWAPTSGRHPCRAALPWSTSSV 

RWQPSEKQPPPPAHRGPADSLSTAAGAAELS 

AEG AGKSRG SGEQDWVNRPKTVRDTLL ALH 

QHGHSGPFESKFKKEPALTAVARTARKRKPS 

PEPEGEVGPPK\TTERPSRGCPHPQRGSRSP*L 

LHPLLCLRHHPLPHLIPTGPHRLKRPRMVP\SP 

MAALILVADNAGGSHASKDANQVHSTTRRN 

SNSPPSPS SMNQRRLGPREVGGQGAGNTGGL 

EPVHPASLPDSSLATSAPLCCTLCHERLEDTH 

FVQCPSVPSHKFCFPCSRQSIKQQGASGEVYC 

PSGEfCCPL VG SN VPWAFMQGEIATILAGD VK 

VKKERDS 


512 


1862 


A 


3957 


1086 


3 


QDRARLDCSSATSAHCNLRLPGS+DSPASASR 

VAGTTDTHHHTWLILGSSVQTGFDHVGQAG 

LELLTSGDPPISASESAGIMGMSHCVWP* S WG 

LSHHMAPPQGDGGRARGTPGPEQSFWNLSC 

H*PRCQVPS*LMTQL/FWGRHQYNPTMKRGK 

LRHREACSLPLPGEGEPGLQPSS\*SQNPCSSPL 

FHHGL* AWLWCPELLLQGQARRH* RSPPS/FK 

CPATLSLTAWSQTKRLRSQFLLLPWL*RAL+H 

PPVCHWPSRRSLGDPLLPRSQG *RDGT* ASTFC 

SYF*DTESHLVAQAGVQWRDLGSLQPPCPRL 

K\RFSRLSPPSSYTHRYVPSHLAESCISSRDRIP 

PSRPDRSRNSNSLSR 


513 


1863 


A 


3961 


3038 


476 


VALTTSMCCNKQVIVIDKIKSASIADRCGALH 

VGDHILSnXjTSMEYCTLAEATQFLANTTDQ 

VKLEILPHHQTRLALKGPDHVKIQRSDRQLT 

WDSWASNHSSLHTNHHYNTYHPDHCRVPAL 

TFPKAPPPNSPPALVSSSFSPTSMSAYSLSSLN 

MGTLPRSLYSTSPRGTMMRRRLKKKDFKSSL 
SLAS ST VGL AGQVVHTETTEV VLTADP VTGF 

GIQLQGSVFATETLSSPPLISYIEADSPAERCG 

VLQIGDRVMAINGIPTEDSTFEEASQLLRDSSI 

TSKVTLEIEFDVAESVIPSSGTFHVKLPKKHN 

VELGITISSPSSRKPGDPLVISDIKKGSVAHRT 

GTLELGDKLLAIDNIRLDNCSMEDAVQILQQC 

EDLVKLKIRKDEDNSDEQESSGA1IYTVELKR 

YGGPLG\ITISGTEEP\FDL*IISSLTKGGLAERT 

GAIHIGDRILVAINSSSLKGKPLSEAIHLLQMAG 

ETVTLKIKKQTDAQSASSPKKFPISSHLSDLGD 

VEEDSSPAQKPGKLSDMYPSHGCPSVDSAVD 

SWDGSA\IDTS\YGTEGT\SFQASGY\NFNTYD 

WRSPKORGS\LSPVT\KPRSOTYPDVGLSYED 

WDRSTASGFAGAAVDSAETEQEENFWSQALE 

DLETCGQSGILRELEATIMSGSTMSLNHEAPT 

PRSPAGSDRPSFQERSSSRPHYSQTTRSNTLPS 

DVGRKSVTLRKMKQEDCEIMSPTPVELHKVT 

LYKDSDMEDFGFSVADGLLEKGVYVKNIRPA 

GPGDLGGLKPYDRLLOVNHVRTRDFDCCLV 

VPLIAESGNKLDLVISRNPLASQKSIDQQSLPG 

D*SEQNSAFFQQPSHGGNLETREPTNTL 


514 


1864 


A 


3967 


833 


800 


LEKQGVSGMATKRLARQLGLIRRXSIAPANG 
NLGRSKSKOLFDYLIVIDFESTCWNDGKHHH 
SQEIIEFPA VLLNTSTGQID SEFQA Y VQPQEHPI 
LSEFCMELTGIKQAQVDEGVPLKICLSQFCK 
WIHKIQQQKNIIFATGISEPS/DF* SKIMCICYL 
VR*RISYTY*SKHKSKGC 


515 


1865 


A 


3969 


492 


182 


CRFWGISTHCDTCDPLSPQTTEG**EGDLWSL 
DLLGPEFLARKPLFKTKTYQSTF* SISKNE/FTC 
PNFIIEEGTDLIF\*QVKHNPCHRLTPEEGTVQL 
NRADS 


516 


1866 


A 


3977 


2 


1357 


KMIXVQKESNYIRLKRAKMDKSMI^KIKTLGI 

GAFGEV CLARK VDTKALYATKTLRKKDVLL 

RNQVAHVKAERDILAEADNEWVVRLYYSFQ 

DKDNLYFVMD Y IPG GDMMSLLIRM GrFPESL 

ARFYIAELTCAVESVHKMGFIHRDIKPDNILID 

RDGHIKLTDFGLCTGFRWTHDSKYYQSGDHP 

RODSMDFSNEWGDPSSCRCGDRLKPLERRAA 

RQHQRCLAHSLVGTPNYIAPEVLLRTGYTQL 

CDWWSVGVILFEMLVGQPPFLAQTPLETQM 

KVINWQTSLHIPPQAKLSPEASDLIIKLCRGPE 

DRLGKNGADEIKAHPIF*NQFDFSQ*PEDSRS 

AFKQFP*NHTTPTDTSNFDP\VDPDKLWSDDN 

EEENVNDTLNGWYKNGKHPEHAFYEFTFRRF 

FDDNG YPYNYPKPIEYEYINSQ GSEQQSDEDD 

QNTGSEIKNRDLVYV 


517 


1867 


A 


3980 


1358 


1022 


FFFKKFTQSLGFLLFSFSFLFSCFFFFHFVLFCY 
VFLDRVPLCHPGWSAWQSQVT/VNLPPSWD 
* RCRPPH/L ANLCNFCRD\SFTTLPRL VLNT W A 
QAIFQPQPPKVLGLQV 


518 


1868 


A 


3986 


974 


666 


SPEMESHPITQAGVQWHHLSSLQPLPPGFK*F 
SCFSLPE*LGYRHVPPCLANSVFSVEMG\FLH 
VGQAGLELLTSGDLPALASQSAGITGVSHRAR 
PENGFENIF 


519 


1869 


A 


3994 


751 


126 


NQGLRHVGLCRTCLVNQMFASSILGKSHHHS 
LISINQGHNALWKAAG\PLPLKAGYC\QSFSPC 
DSLKYG\SWDEKDLTVPQRDTHKRSVLRWIS 
QRGKU^AVEMEEGHCLIALPL GTECLGIKVPIV 
HLFSSEMGE\NRPMVG\ARHVYSNAALLSFTP 



WO 01/57188 



PCT/US01/03800 



SEQ ID NO: of nucleotide sequence 

SEQ ID NO: of peptide sequence 

Method 

SEQ ID NO: in USSN 09/496,914 

Predicted beginning nucleotide location corresponding to first 
amino acid residue of peptide sequence 

Predicted end nucleotide location corresponding to last amino 
acid residue of peptide sequence 

Amino acid sequence (A=Alanine, C=Cysteine, 
D=Aspartic Acid, E=Glutamic Acid, 
F=Phenylalanine, G=Glycine, H=Histidine, 
I=Isoleucine, K=Lysine, L=Leucine, 
M=Methionine, N=Asparagine, P=Proline, 
Q=Glutamine, R=Arginine, S=Serine, 
T=Threonine, V=Valine, W=Tryptophan, 
Y=Tyrosine, X=Unknown, *=Stop codon, 
/=possible nucleotide deletion, \=possible 
nucleotide insertion) 
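
As an illustration of the symbol legend in the column heading above, the following Python sketch tallies the symbols in one table entry. It is not part of the original disclosure: the function name `summarize` and the example string are hypothetical, whitespace is assumed to be a layout artifact and is ignored, and any character outside the legend (for example a digit or a lowercase letter) is reported rather than interpreted.

```python
from collections import Counter

# The 20 one-letter residue codes listed in the column legend.
RESIDUES = frozenset("ACDEFGHIKLMNPQRSTVWY")

# Non-residue symbols defined by the same legend.
MARKERS = {
    "X": "unknown",
    "*": "stop_codon",
    "/": "possible_deletion",
    "\\": "possible_insertion",
}


def summarize(sequence):
    """Count legend symbols in one entry (hypothetical helper).

    Whitespace is dropped on the assumption that it is a layout artifact.
    Characters that the legend does not define are collected unchanged
    so that they can be reviewed by hand.
    """
    counts = Counter()
    unrecognized = []
    for ch in "".join(sequence.split()):
        if ch in RESIDUES:
            counts["residues"] += 1
        elif ch in MARKERS:
            counts[MARKERS[ch]] += 1
        else:
            unrecognized.append(ch)
    summary = {name: counts[name] for name in ("residues", *MARKERS.values())}
    summary["unrecognized"] = unrecognized
    return summary


if __name__ == "__main__":
    # Hypothetical example string, not an entry from the table.
    print(summarize("MKT*AX/LV\\G1"))
```

For the hypothetical input shown, the summary reports seven residues, one each of the unknown, stop codon, possible deletion and possible insertion symbols, and the single unrecognized character `1`.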


LRCLGGEKHKSGLHARPVIVPSLELHYDMDSI 
AHWFADLLLIITLPSYYIPFC 


520 


1870 


A 


3999 


882 


698 


QSFRLSLLSSWDYRHM*PRLANF*TvFFCRDR/ 
SLALLPRLVSNSWPQA1LPPRPPKVLGLQT 


521 


1871 


A 


4011 


1346 


1178 


FFF*ETVSCSAS*AGVRSHDNSSLQPPSPG\SSN 
PPTSASHVAGATGTHHHAWLLSV 


522 


1872 


A 


4015 


2 


377 


QGIALLTRMGESVKHVTGGYKLRTRPLEFAA 
IGDYLDTFALKLGTIDRIAQRIIKEEIEYLVELR 
EYGPVYSTWSALEGELAEPLEGVSACIGNCST 
AL* ELTDDMTEDFLFVLREYIL YSDSMK 


523 


1873 


A 


4018 


341 


19 


ERVIHNQIQQAQRSPHIFNARRSS/PRPNIVELP 
KVKEVCKTSKS/GQVIYKGVSIRLRANFLAEP 
L*NRREWDEAIKVLKEKQ\FLSKMVYPANLSF 
GNEGDITSFPAK 


524 


1874 


A 


4020 


1067 


743 


FFLRWSL/DSVAQAGVKWCNLGSLQAPPPGF 
TPFSCLSLPSSWDYRHPPPRLAN*LTNFLCF** 
RQGFTVLARMVLIS*PHDLPASASQSAGITGL 
SHCSWPTSSILS 


525 


1875 


A 


4021 


781 


351 


QFRV1FFFLRRSHSVAQAGMQWHDHSLLQPL 
PPRLKQ/F/SHLSPPSIWDYRRVPPCLVNFSIFF 
VETGSCQPCLQLLGSSNPPASASQSAGIAGISH 
QGQPE* SFDIRFACVI AALRETFQCLCS A SR VN 
N KIIN RPTHP VES SF 


526 


1876 


A 


4024 


80 


341 


TPSSTSRGTEEQQSSKMAWQRREEKEHLNVR 
RSSAEDGWKADKP/VDG+TPGEDHLPTPSPFQ 
LHIHSSESQLHHSVKSPPSLSFRLM 


527 


1877 


A 


4026 


593 


230 


DFYLYPERKKRGQMMTAVSLTTRPQESVAFE 
DVAVYFTTKEWAIMGVPAERALYRDVMLEN 
YGGCGPL*CHPTSKPALVFS\LEQGKESCFSPA 
TGSSLSRNDWRAGWIGYLELRRYTYLS 


528 


1878 


A 


4028 


1160 


242 


GTSELLCIQRWNWGPAFPPRPGLALAPTLQLL 

VEMG S AKS VPVTPARPPPHNKHL ARV ADPRS 

PSAGILRTPIQVESSPQPGLPAGEQLEGLKHAQ 

DSDPRSPTLGIARTPMKTSSGDPPSPLVKQLSE 

VFETEDSKSNLPPEPVLPPEAPLSSELDLPLGT 

QLSVEEQMPPWNQTEFPSKQVFSKEEARQPT 

ETPVASQSSDKPSRDPETPRSS\GSMRNRWKP\ 

NSSKVL\GKSPLHPSCQDDNSPGTLTLRQGKA 

AFKPLSENVSELK\EGAMLGTGR\LLKTEGRA 

WEQGQD\HDKENQHFPLVES 


529 


1879 


A 


4039 


2 


366 


KDMVLIMEMQSMITMKCPQ YL+ E*RKIPDITK 
CW*GCGSTGnJFC/WS*PL*KTI*QPR*FKQI*T 
ILTIIYSIM*EHTFHNAGV*LSDIYPRFMKGYV 
HTEICT*MF1AVLFVVVKTWKQF 


530 


1880 


A 


4057 


358 


3 


LLEVNGNTIVTVFTKAQNKKNKGSRSILFKQL 
RKYGSRINLLKSKHDKNICTENYKT*MKEIEA 
/DTDKWKDILCSWIRRJHMKDILCSWIGRTHV 
VKISILPKVNYRFYLISIKIIMAI 


531 


1881 


A 


4061 


50 


278 


TQGTEEIYKISSCEWVQASFSTPLITLHDFKIY 
HKATV1KM VW Y WHRQ* KFSKN/RIESSEIEPH 
IYDQFIFDKGEKIIQEKGNSFFNN/MCWKNWIF 
T*KR 


532 


1882 


A 


4069 


19 


368 


NDLLENFKFWE*FKE*LENINGTVTEKETGGV 
YKELSSPKYSGTRQFYGQnSNFPGKIISMVY 
KXFQNTE/TEGRHPISLYEFR1TLITIPNKDNIYL 
QIWMPVSLMNIVTLKCPT 


533 


1883 


A 


4076 


1 


355 


PIRKFTKVAG*KSNTPK*LAFLHINNEQFENKJ/ 
ITNI/PFIIASKRIKYSGISLTKEMKDLYTETLLR 
KIKEDTNKWKDI/SCFWVGR/LNIVKMPK/VIC 
IFNAIPIKMPMMCMAKIEKNSS 


534 


1884 


A 


4088 


3 


1931 


IIDSSTRRMESERSPLYRQUDLGYLSSSHWNC 

GAPGQDTKAQSMLVEQSEKLRHLSTFSHQVL 

QTRJLVDAAKALNLVHCHCLDIFINQAFDMQR 

DLQITPKRLEYTRKKENELYESLMNIANRKQE 

EMKDMIVETLNTMKEELLDDATNMEFKDVI 

VPENGEPVGTREIKCCIRQIQELIISRLNQAVA 

NKLISSVDYLRESFVGTLERCLQSLEKSQDVS 

VHITSNYLKQILNAAYHVEVTFHSGSSVTRM 

LWEQIKQHQRTTWVSPPAiTLEWKRKVAQEAI 

ESLSASKLAKSICSQFRTRLNSSHEAFAASLRQ 

LEAGHSGRLEKTEDLWLRVRKDHAPRLARLS 

LESRSLQDVLLHRKPKLGQELGRGQYGWYL 

CDNWGGHFPCALKSVVPPDEKHWNDLALEF 

HYMRSLPKHERLVDLHGSVIDYNYGGGSSIA 

VLLIMERLHRDLYTGLKAGLTLETRLQJALDV 

VEGIRFLHSQGLVHRDIKLKNVLLDKQNRAKI 

TDLGFCKPEAMMSGSIVGTPIHMAPELFTGK 

YDN S VD VYAFGILFWYICSG S VKLPEAFERCA 

SKDHLWNNVRRGARPERLPVFDEECWQLME 

AC WDGDPLKRPLLGIVQPMLQG IMNRJLC K S\ 

NSEQPNRGLDDST 


535 


1885 


A 


4090 


2 


417 


ALMPHEANYEEIFLKTDKDMDGFESGLEVRE 

IFLKTR/GLPSTLLAHIWALCDSKDCGKLSKD 

HFALAFHLITVQKLIKGIDPPLVLTPEKISPSNR 

ASLQKVTELTRKPVCUFKGTILWRITDSIWMK 

HNRKRIWLRA 


536 


1886 


A 


4102 


569 


829 


DHQK*KNIPCSWIGRINIVKMSILPKAIYRFSAI 
PIKIPMTFFTEI*S*NVYRTTKTQE*AKAILSKK 
b^NbbbSH YLDFK* Y YRA V 


537 


1887 


A 


4104 


54 


281 


SIDCEHLIRRMLVLDPSKRLTIAQIKEHKWML 
IEVPVQRPVLYPQEQENEPSIGEFNEQVLRLM 
HSLGIDQQKTIE 


538 


1888 


A 


4109 


141 


314 


IRFnPLKIRSVVSHLKCFYKFILTFFFAGCSQPL 
VPRENITAWMNAIGLIITALPVS 


539 


1889 


A 


4111 


268 


1 


ASRPWGHSYP*FNQQEVDTLKRPIASSEI*MM 
I * KPATVKKSPGPYRFTAEFSHTFKEDL VPIL W 
PLFPKJYREGTLPHSFYEASITL 


540 


1890 


A 


4142 


198 


2064 


PEPGAGRAATPWGPLFWRGRGSGRCEKAAE 

AALGDFLGLHRRTQQPAVDRLLSDASAQWR 

VRGHGGVRESGRAPQQPGRRRGRRPRKRPR 

GRWRREGCGAGGRGVCVAAWSQRSIAGNN 

DYRLFHKMSNSHPLRPFTAVGEIDHVHILSEH 

JGALL1GEEYGDVTFVVEKKRFPAHRVILAAR 

CQYFRALLYGGMRESQPEAEIPLQDTTAEAFT 

MLLKYIYTGRATLTDEKEEVLLDFLSLAHKY 

GFPELEDSTSEYLCTILNIQNVCN4TFDVASLY 

SLPKLTCMCCMFMDRNAQEVLSSEGFLSLSK 

TALLN1 VLRDSFAAPEKDIFLALLN WCKHN SK 

ENHAEIMQAVRLPLMSLTELLNVVRPSGLLSP 

D A1LD AI K. VKM5 SRJDMDLN YRGMUPEENIAT 

MKYGAQWKGELKSALLDGDTQNYDLDHG 

FSRHPIDDDCRSGIEIKLGQPSIINHVRILLWDR 

DSRSYSYFIEVSMDELDWVRVIDHSQYLCRS 

WQKLYI^ARVCRYIRJVGTHNTVNKIFHIVAF 

ECMFTNKTrTLEKGLlVPMENVATIADCASVI 

EGVSRSRNALLNGDTKNYDWDSGYTCHQLG 

SGAIVVQLAQPYMIGSIRVLLWDCDDRSY 


541 


1891 


A 


4146 


282 


778 


GTLGYPNGARGQPQDNFFAHQWSHHPPISAC 
HAESENFAFWQDMKWKNKFWGKSLEIVPVG 
TVNVSLPRFGDHFEWNKVTSCIHNVLSGQRW 
EEHYGEVLIRNTQDSSCHOOTFCKAKYWSSN 
VHEVQGAVLSRSGRVLHRLFGKWHEGLYRG 
PTPGGQCIWKP 


542 


1892 


A 


4147 


44 


433 


S VDAYVCNDIVFS YRTTITLLEGA* LTHRYVA 
QDPKQGQLRSLHLTCDSAPAGSQGTWSTSCR 
INHLIFRGGAQITFLATFDDSPKAVLGDRLLLT 
ANVSSENNTPRTSKTTFQLEL SVKDA V YTW 
SSH 


543 


1893 


A 


4153 


678 


11 


TISYPQCLTQMYFLISFANVDTFLLPIMALDH 

YVAICSALQ*CSIITP/ELCQGLPVLA*AGSSUS 

PVHTVIMSRLAFCSSAQISHFYRDAYLLMKIA 

CSHT*\NQHVFLGAWLFLAPCALILVSYIRIA 

AAJLRIPSPTRRRKACSICSSHLSLVTLFYGTV 

LGICI*PPDSFSAQDAIATIMYTVVTSMLNPFIY 

SLMNKEVQEAVRRLFSRGSHSSWCW 


544 


1894 


A 


4158 


3 


538 


LLYAQAG VQ* LNL S SLQPQPAGLKQSSHPSLP 

SSWDYRYSTPHPANFFVEMEFHHVAQAGLEL 

LGSGDLPTSTSH SAGITG V\SHHAPPRLI S SEGS 

LLGHLLCLPMVFPLLCVFVL1SSSLAGEEAAG 

LRVQICLWPAWLSHLPVCWFHCSGIWSEVIE 

LKVGREGHVLPWQAHWEF 


545 


1895 


A 


4160 


1 


412 


HPLGLGLVPSEIFSPQDKKAADGSILAPARGE 

DLEAGLKGSFMDGRLQASVSVFRIQRVGSAM 

QDTASAMPCLPYYPTSHCFMAGGKSRSQGW 

ELELSGEPAPGWQVLAGYTYTQARYLRDASE 

ANVGQPLRPVDPR 


546 


1896 


A 


4174 


1252 


1190 


FFQVFIFLFLIFFKTEFHSCCPGAVQWHDLDSL 
QPPPPRFKGFSCLSLPSSWDYRHAPAHPANFV 
FLVETGFLHV\GQ\ASLELPTSGDTPAS\ASQSA 
GITGVSHHA*PRASGRRCW 


547 


1897 


A 


4176 


3029 


1 


A GPD GLA APA SC QG ARGQTR VPG AF S W LAP 

GSHHASEGLAPGVPPAGGVSAQELTAPPQEG 

WGLGAPPAAPRPESDEKRAGSDAVRSFSRGA 

RDSLGQRRLGGTRGAGPAGKGAQRTMGPAS 

GFHSFPPRPHQEPSPRSSCWQHLLWHCPWPQ 

PSRLPRLTPAQLLQGPGVLAAPPGP* H VPGFL 

AQSPWPLPSGPRSP*DPLHQGALVPLPQGGSP 

HTAPHCLPSVLSPAIQQPLLPTAST/SSRSPPAS 

TMAPIPSALAVWEPAGSSPQLSSAPADSSVPLP 

ALPKVLPPWTQKPLLGCLCQSPLPLLSPPDQI/ 

RCPPACSPAAASSFSFESQPCPSAPSKASPAPA 

ALUVGPHHPP*SQQPQSQSVHPHGPGGPQPPL 

AASSLFWMFCQPPPPHPQFLWHRPLPVTGKA 

LAS\PLCFRPAPGSLRQTPLPPQFHIPRPGLSAP/ 

PPPASGTSDSSDSRSPSASAARVWPPA\SPPPP 

AARHRPHPPEYFLSPCPFSCGFPRLLGRPRRPQ 

ALQTTRAWDLPPGSSPAPLCSGPELP*APPPLP 

PFPRVA*LGSGHPPSAQVPGLW*RCV*GHPIP 

RPVGHS*SGPPHSPPL*APPQAWPLELPPSRQC 

LQPLHLRAAQPLDPCCSLSPPGPPLPVPALPS 

WPGRP*SPSPASSQPPYHAGLPGPQSSPLPPGL 

PQLPSLRSGSQQPLLFFQCPGPGAVWGKGSPQ 

PLSPHPPPP/ARTQTFPVASRSLSPGTAPYSVCL 

TPSRSASSLPEWLASSLPKJPQSSGSXPLGPTSP 

MP*CFHRPSPPLP/LSSPFPA\LRPQAPQFPLHLP 

P*PPAPSPGCPLPPLAQQHQPSPPSPHARSTLT 

PPLWPSLALLP*PLPPPPPVPSFSASLLCSLPAH 
GTPASPGLGRSCLGKPQTLPWISFWPPSGRLA 

PGTWQPW/PVSPAPLSCLSAWDPWELPSPQPQ 

VCSTAELPTSCLLSSPGP\PAFQPPRFGCL*GPP 

GPPGLPPLQSSLSFPPPPPPVPQPPAPPALQWG 

LHLPGGRTK 


548 


1898 


A 


4180 


2369 


844 


RIHREEDFQFILKGIARLLSNPLLQTYLPNSTK 

KIQFHQELLVLFWKLCDFNKVGQPRGALQGD 

GEQLPQ*PGGRDSVRLRGVGQSCPSLELSPLG 

PSPHP*KFLFFVLKSSDVLDILVPILFFLNDAR 

ADQSRVGLMHIGVF1LLLLSGECNFGVRLNKP 

YSIRVPMDIPVFTGTHADLLIVWFHKIITSGHQ 

RLQPLFDCLLTIVVNVSPYLKSLSMVTANKLL 

ML-LiiArM 1 Wrlvr^AAQiNHnLVrrLfLbVrTNNl 

IQYQFDGNSNLVYAIIRKRSIFHQLANLPTDPP 

TIHKALQRRRRTPEPLSRTGSQGGAPPWRAPA 

PLPLQSQAPSRPVWWLLQALTS*PRSPRCQR 

MAPCGPWNLSPSRAWRMAARLRGSPARHGG 

SSGDRP/HSSASGQWSPTPEWVLSWKSKLPLQ 

I IMKLLy VL Vr\> VbivlL-lDKOL 1 DtSEILRFLQ 

HGTLVGLLPVPHPILIRKYQANSGTAMWFRT 

YMWGVIYLRNVDPPVWYDTDVKLFEIQRV 


549 


1899 


A 


4191 


858 


321 


LPWQRLGVLLSRGKMAVTGWLESLRTAQKT 

ALLQDGRRK VHYLFPDGKEMAEE YDEKTSE 

LLVRKWRVKSALGAMGQWQLEVGDPAPLG 

AGNLGPELIKESNANPIFMRKDTKMSFQWRIR 

NIJ^YPKDVYSVSVDQKERCIIVRTTNKKYYK 

KFSIPDLDRHQLPLDDALLSFA\TPTAP 


550 


1900 


A 


4192 


1 


1980 


IRHTGSDIAGVCGWLLLSGPCGVGLDLDSRLL 

GASAMRRSEVLAEESIVCLQKALNHLRE1WE 

LIGIPEDQRLQRTEWKKHIKELLDMMIAEEE 

SLKERLIKSISVCQKELNTLCSELHVEPFQEEG 

ETTILQLEKX>LRTQVELMRKQKKERKQE\LKL 

LQEQDQELCVEILCMPHYDIDSASVPSLEELNQ 

FRQHVTTLRETKASRREEF/VSSIKRQIILCME 

ELDHTPDTSFERDVVCEDEDAFCLSLENIAT\L 

QKLLRQ\LEMQKSQNEAVCEGVLRTQI\RELW 

DRLQIPEEEREAVATIMSGSKAKVRK\ALQ\LE 

VDRLEELEKCKTMKKVIEAIRVELVQYWDQC 

FYSQEQRQAFAPFCAEDYTESLLQLHDAEIVR 

i itv r\hA^L,r cAj V l/K. W Jb-ti 1 W KJL-r LbrbK 
KASDPNRFTNRGGNLLKEEKQRAKLQKMLP 
KJLEEELKARIELWEQEHSKAFMVNGQKFME 
Y VAEQ WEMHRLEKERAKQERQLKNKKQTET 
EMLYGSAPRTPSKRRGLAPNTPGKARKLNTT 
TMSNATANSSIRPIFGGTVYHSPVSRLPPSGSK 
PVAASTCSGKKTPRTGRHGANKENLELNGSI 
LSGGYPGSAPLQRNFSINSVASTYSEFADPSLS 
DSSTVGLQRELSKASKSDATSGILNSTNIQS 


551 


1901 


A 


4194 


3 


1008 


AWHEGLVSSPAIGAYLSASYGDSLVVLVATV 

VALLDICFILVAVPESLPEKMRPVSWGAQISW 
KOADPFASl KKVOKF^TVl 1 UPTTVPT cjVf PF 

AG\QYSSFF\LYLR\QVIGFG\TVKIAAFIAMVGI 

LSIVAQTAFLSILMRSLGNKNTVLLGLGFQML 

QLAWYGFGSQAWMMWAAGTVAAMSSITFP 

AISALVSRNAESDQQGVAQGIITG1RGLCNGL 

GPALYGFIFYMFHVELTELGPKLNSNNVPLQ 

GAV1PGPPFLFGAC1VLMSFLVALFIPEYSKAS 

GVQKHSNSSSGSLTNTPERGSDEDIEPLLQDS 

SIWELSSFEEPGNQCTEL 
552 



1902 



4197 



14302 



ARPPPAPGSRQQKQKAAPGAAAAAELRGAR 

EPAPARRRGTMADGG EGEDEIQFLRTDDE W 

LQCTATIHKEQQKLCLAAEGFGNRLCFLESTS 

NSKNVPPDLSICTFVLEQSLSVRALQEMLANT 

VEKSEGQVDVEKWKFMMKTAQGGGHRTLL 

YGHAILLRHSYSGMYLCCLSTSRSSTDKLAFD 

VGLQEDTTGEACWWTOIPASKQRSEGEKVR 

VGDDLILVSVSSERYLHLSYGNGSLHVDAAF 

QQTLWSVAPISSGSEAAQGYLIGGDVLRLLH 

GHMDECLTVPSGEHGEEQRRTVHYEGGAVS 

VHARSLWRLETLRVAWSGSHIRWGQPFRLR 

HVTTGKYLSLMEDKNLLLMDKEKADVKSTA 

FTFRSSKEKLDVGVRKEVDGMGTSEIKYGDS 

VCYIQHVDTGLWLTYQSVDVKSVRMGSIQR 

KAIMHHEGHMDDGISLSRSQHEESRTARVIRS 

TVFLFNRFIRGLDALSKKAKASTVDLPIESVSL 

SLQDLIGYFHPPDEHLEHEDKQNRLRALKNR 

QNLFQEEGMINLVLEC1DRLHVYSSAAHFAD 

VAGREAGESWKSILNSLYELLAALIRGNRKN 

CAQFSGSLD WUSRLERLEAS SGILEVLHCVL 

VESPEALNIIKEGHIKSIISLLDKHGRNHKVLD 

VLCSLCVCHG VAVR SNQHLICDNLLPG RDLL 

LQTRLVNHVSSMRPNIFLGVSEGSAQYKKWY 

YELMVDHTEPFVTAEATHLRVGWASTEGYSP 

YPGGGEEWGGNGVGDDLFSYGFDGLHLWSG 

CIARTVSSPNQHLLRTDDVISCCLDLSAPSISF 

RINGQPVQGMFENFNIDGLFFPWSFSAGDCV 

RFLLGGRHGEFKFLPPPGYAPCYEAVLPKEKL 

KVEHSREYKQERTYTRJDLLGPTVSLTQAAFT 

PIPVDTSQIVLPPHLERIREKLAENIHELWVMN 

KIELGWQYGPVRDDNKRQHPCLVEFSKLPEQ 

ERNYNLQMSLETLKTLLALGCHVGI SDEHAE 

DKVKKMKLPKNYQLTSGYKJPAPMDLSFIKLT 

PSQEAMVDKLAENAHNVWARDRIRQGWTY 

GIQQDVKNRRNPRLVPYTPLDDRTKKSNKDS 

LREAVRTLLGYGYNLEAPDQDHAARAEVCS 

GTGERFRTFRAEKTYAVKAGRWYFEFETVTA 

GDMRVGWSRPGCQPDQELGSDERAFAFDGF 

KAQRWHQGNEHYGRSWQAGDWGCMVDM 

NEHTMMFTLNGEILLDDSGSELAFKDFDVGD 

GFIPVCSLGVAQVGRMNFGKDVSTLKYFTIC 

GLQEGYEPFAVNTNRDITMWLSKRLPQFLQV 

PSNHEHIEVTRIDGTIDSSPCLKVTQKSFGSQN 

SNTOIMFYRLSMPJOECAEVFSKTVAGGLPGAG 

LFGPKNDLEDYDADSDFEVLMKTAHGHLVP 

DRVDKDKEATKPEFNNHKDYAQEKPSRLKQ 

RFLLRRTKPDYSTSHSARLTEDVLADDRDDY 

DFLMQTST YYYS VRIFPGQEPANV W VG WITS 

DFHQYDTGFDLDRVRTVTVTLGDEKGKVHE 

SIKRSNCYMVCAGESMSPGQGRNNNGLEIGC 

VVDAASGLLTFIANGKELSTYYQVEPSTKLFP 

AVFAQATSPNVFQFELGRIKNVMPLSAGLFKS 

EHKNPVPQCPPRLHVQFLSHVLWSRMPNQFL 

KVDVSRI SERQG WLVQCLDPLQFMSLHIPEEN 

RSVDILELTEQEELLKFHYHTLRLYSAVCALG 

NHRVAHALCSHVDEPQLLYAIENKYMPGLLR 

AGYYDLLIDIHLSSYATARLMMNNEYIVPMT 

EETKSITLFPDENKKHGLPGIGLSTSLRPRMQF 

SSPSFVSISNECYQYSPEFPLDILKSKTIQMLTE 

AVKEGSLHARDPVGGTTEFLFVPLIKLFYTLLI 
MGIFHNEDLKHILQLIEPSVFKEAATPEEESDT 

LEKELSVDDAKLQGAGEEEAKGGKRPKEGLL 

QMKLPEPVKLQMCLLLQYLCDCQVRHRIEAI 

VAFSDDFVAKLQDNQRFRYNEVMQALNMSA 

ALTARKTKEFRSPPQEQINMLLNFKDDKSECP 

CPEEIRDQLLDFHEDLMTHCGIELDEDGSLDG 

NSDLTIRGRLLSLVEKVTYLKKKQAEKPVES 

DSKKSSTLQQLISETMVRWAQESVIEDPELVR 

AMFVLLHRQYDGIGGLVRALPKTYTINGVSV 

EDTINLLASLGQIRSLLSVRMGKEEEKLMIRG 

LGDIMNNKWYQHPNLMRALGMHETVMEV 

MVNVLGGGESKEITFPICMVANCCRFLCYFCR 

ISRQNQKAMFDHLS YLLEN SS VGL ASPAMRG 

STPLDVAAASVMDNNELALALREPDLEKVVR 

YLAGCGLQSCQMLVSKGYPDIGWNPVEGER 

YLDFLRFAVFCNGESVEENANVVVRLL1RRPE 

CFGPALRGEGGNGLLAAMEEAIKIAEDPSRD 

GPSPNSGSSKTLDTEEEEDDTIHMGNAIMTFY 

SALIDLLGRCAPEMHLIHAGKGEAIRIRSILRS 

LIPLGDLVGVISIAFQMPTIAKDGNVVEPDMS 

AGFCPDHKAAMVLFLDRVYGIEVQDFLLHLL 

EVGFLPDLRAAASLDTAALSATDMALALNRY 

LCTA VLPLLTRCAPLF AGTEHHA SLIDSLLHT 

VYRLSKGCSLTKAQRDSIEVCLLSICGQLRPS 

MMQHLLRRLVFDVPLLNEHAKMPLKLLTNH 

YERCWKYYCLPGGWGNFGAASEEELHLSRK 

LFWGIFDALSQKKYEQELFKLALPCLSAVAG 

ALPPDYMESNYVSMMEKQSSMDSEGNFNPQ 

PVDTSNITIPEKLEYFINKYAEHSHDKWSMDK 

LANGWIYGEIYSDSSKVQPLMKPYKLLSEKE 

KEIYRWPIKESLKTMLARTMRTERTREGDSM 

ALYNRTRRJSQTSQVSVDAAHGYSPRAIDMS 

NVTL SRDLHAMAEMMAENYHM W AKXKKM 

ELESKG GGNHPLLVP YDTLTAKEKAKDREKA 

QDILKFLQINGYAVSRGFKDLELDTPSIEKRFA 

YSFLQQLIRYVDEAHQYILEFDGGSRGKGEHF 

PYEQEIKFFAKVVLPLIDQYFKNHRLYFLSAA 

SRPLCSGGHASNKEKEMVTSLFCKLGVLVRH 

RISLFGNDATSIVNCLHILGQTLDARTVMKTG 

LESVKSALRAFLDNAAEDLEKTMENLKQGQF 

THTRNQPKGVTQITNYTTVALLPMLSSLFEHI 

GQHQFGEDLJLEDVQVSCYRILTSLYALGTSK 

SIYVERQRSALGECLAAFAGAF*PVAFLETHLD 

KHNIYSIYNTKSSRERAALSLPTNVEDVCPNIP 

SLEKLMEEIVELAESGIRYTQMPHVMEVILPM 

LCSYMSRWWEHGPENNPERAEMCCTALNSE 

HMNTLLGNILKJIYNNLGIDEGAWMKRLAVF 

SQPIINKVKPQLLKTHFLPLMEKLKKKAATVV 

SEEDHLKAEARGDMSEAELL1LDEFTTLARDL 

YAF'YPLLIRFVDYNRAKWLKEPNPEAEELFR 

MVAEVFIYWSKSHNFKREEQNFVVQNEINN 

MSFLITDTKSK M S JC A A V^DOF R K K MK MKdrt 

RYSMQTSLIVAALKRLLPIGLNICAPGDQELIA 
LAKNRFSLKDTEDEVRDIIRSNIHLQGKLEDP 
AIRWQMALYKDLPNRTDDTSDPEKTVERVL 
DIANVLFHLEQKSKRVGRRHYCLVEHPQRSK 
KA V WHKLLSKQRKRA V VACFRMAPL YNLPR 
HRAVNLFLQGYEKSWIETEEHYFEDKLIEDLA 
KPGAEPPEEDEGTTCRVDPLHQLILLFSRTALT 
EKCKLEEDFL YMA Y ADIMAKSCH DEEDDDG 
EEEVKSFEEKEMEKQKLLYQQARLHDRGAA 

EMVLQTISASKGETGPMVAATLKLGIAILNGG 

NSTVQQKMLDYLKEKKDVGFFQSLAGLMQS 

CSVLDLNAFERQNKAEGLGMVTEEGSGEKV 

LQDDEFTCDLFRFLQLLCEGHNSDFQNYLRT 

QTGNNTTVNIII STVDYLLRVQESI SDFY WYY 

SGKDV1DEQGQRNFSKAIQVAKQVFNTLTEYI 

QGPCTGNQQSLAHSRJLWDAWGFLHVFAHM 

QMKLSQDSSQIELLKELMDLQKDMWMLLS 

MLEGNWNGTIGKQMVDMLVESSNNVEMIL 

KFFDMFLKLKDLTSSDTFKEYDPDGKG VI SK. 

RDFHKAMESHKHYTQSETEFLLSCAETDENE 

TLDYEEFVKRFHEPAKDIGFNVAVLLTNLSEH 

MPNDTRLQ1TLELAESVLNYFQPFLGRIEIMG 

SAKRIERVYFEISESSRTQWEKPQVKESKRQFI 

FDWNEGGEKEKMELFVNFCEDTIFEMQLAA 

QISESDLNERSANKEESEKERPEEQGPRMAFF 

SILTVRSALFALRYNILTLMRMLSLKSLKKQM 

KKVKKMTVKX)MVTAFFSSYWSIFMTLLHFV 

ASVFRGFFRIICSLLLGGSLVEGAKKIKVAELL 

ANMPDPTQDEVRGDGEEGERKPLEAALPSED 

LTDLKELTEESDLLSDIFGLDLKREGGQYKJLIP 

HNPNAGLSDLMSNPVPMPEVQEKFQEQKAK 

EEEKEEKEETKSEPEKAEGEDGEKEEKAKED 

KGKQKXRQLHTHRYGEPBVPESAFWKKHAY 

QQKLLNYFARNFYNMRMLALFVAFAINFILL 

FYKVSTSSWEGKELPTRSSSENAKVTSLDSS 

SHRIIAVHYVLEESSGYMEPTVRJLPILHTVISF 

FCIIGYYCLKVPLVIFKREKEVARKLEFDGLYI 

TEQPSEDDIKGQWDRLVINTQSFPNNYWDKF 

VKRKVMDKYGEFYGRDR1SELLGMDKAALD 

FSDAREKKK PKKDSSLS AVLN SIDVKYQM W 

KLGWFTDNSFLYLAWYMT 


553 


1903 


A 


4199 


31 


767 


LPELNGRGAGLRRAEPSERGGGAERTQQVAA 
LPLSHGHSHGGGGCRCAAER/VGAARGSAAC 
AYGLYLRIDKGRLQCLNESREGSGRGVFKPW 
ERAD\DRSKFVESDADEELLFNIPFTGVHVKLK 
GI IIMGEDDDSHPSEMRL YKNIPQMSFDDTER 
EPDQTFSLNRDLTGELEYATKISRFSNVYHLSI 
H1SKNFGADTTKVFYIGLRGEWTELRRHEVTI 
CNYEASANPADHRVHQVTPQTHFIS 


554 


1904 


A 


4200 


1 


961 


GIPCTEMGNFDNANVTGEIEFA1HYCFKTHSL 

EICIKACKNLAYGEEKKKKCNPYVKTYLLPD 

RSSQGKRKTGVQRNTVDFIrQETLKYQVAPA 

QLVTRQLQVSVWHLGTLARRVFLGEVIIPLAT 

WDFEDSTTQSFRWHPLRAKADKYEDSVPQS 

NGELTVRAKLVLPSRTRKLQEAQEGTDQPSL 

HGQLCLWLGAKNLPVRPDGTLNSFVKGCLT 

LPDQQKLRLKSPVLRKQACPQWKHSFVFSGV 

TPAQLRQSSLELTVWDQALFGMNDRLLGGT\ 

RLGSKGDTAVGGDACSQSKLQWQKVLSSPN 

LW 1 DM 1 -LVLll 


555 


1905 


A 


4211 


331 


2419 


KENKKARNLRMNQSRSRSDGGSEETLPQDH 

NHHENERRWQQERLHREEAYYQFINELNDE 

DYRLMRDHNLLGTPGEITSEELQQRLDGVKE 

QLASQPDLRDGTNYRDSEVPRESSHEDSLXE 

WLNTFRRTGNATRSG QNGNQT WRAVSRTNP 

NNGEFRFSLE1HVNHENRGFEIHGEDYTDIPLS 

DSNRDHTANRQQRSTXSPVARRTRSQTSVNFN 

GSSSNIPRTRLASRGQNPAEGSFSTLGRJLRNGI 
GGAAGIPRANASRTNFSSHTNQSGGSELRQRE 

GQRFGAAHVWENGARSNVTVRNTNQRLEPI 

RLRSTSNSRSRSPIQRQSGTVYHNSQRESRPV 

QQTTRRSVRRRGRTRVFLEQDRERERRGTAY 

TPFSNSRLVSRITVEEGEESSRSSTAVRRHPTIT 

LDLQVRVRIRPGENRDRDSIANRTRSRVGLAE 

NTVTIESNSGGFRRTISRLERSGIRTYVSTITVP 

LRRISENELVEPSSVALRSILRQIMTGFGELSSL 

MEADSESELQRNGQHLPDMHSELSNLGTDN 

NRSQHREGS SQDRQAQGDSTEMHGENETTQP 

HTRN SD SRGGRQLRNPNN L VETGTL PILRLAH 

FFLLNESDDDDRIRGLTKEQIDNLSTRHYEHN 

SIDSELGKICSVCISDYVTGNKLRQLPCMHEF 

JHIJHCI UR WL S JEN CTCPICRQP V L GSN IANNG 


556 


1906 


A 


4212 


3 


462 


LQRQRQHPAAAPAVPVRCFTFCFTDIVIMPKR 
KSPENTEGKDGSKVTKQEPTRRSARLSAKPA 
PPKPEPKPRKTS AKKEPGAK1 SRG AKGKKJEEK 
QEAGKEGTAPSENGETKAEEIHISRSTVNVST 
SRGTPPSTLSVKGQIETVRVKGTEN 


557 


1907 


A 


4213 


774 


507 


ARRFSCLTLQTSWGHRH\GPPRP\ANFVFLVET 

GFLHIGQAGHKLPTSGDPPASASQSARITGMS 

HRTWFLASFLIDSCKNFIVYKIMYTL 


558 


1908 


A 


4225 


3 


1253 


TYRHAEREHPETSSATKVSYDYRHKRPKLLD 

GDQDFSDGRTQKYCKEEDRKYSFQKGPLNRE 

LDCFNTGRGRETQIXjQVKEPFKPSKKDSIAC 

TYSNKNDVDLRSSNDKWKEKKKKEGDCRKE 

SNSSSNQLDKSQKLPDVKPSPINLRKKSLTVK 

VDVKKTVDTFRVASSYSTERQMSHDLVAVG 

RKSENFHPVFEHLDSTQNTENKPTGEFAQEnT 

IIHQVKANYFPSPGITLHERFSXKMAOIHKADV 

NEIPLNSDPEIHRRIDMSLAELQSKQAVIYESE 

QTL IKI1DPNDLRHDIERRRKERLQNEDEHIFHI 

ASAAERDDQNSSFSKNYTTQRKDIITHKPFEV 

EGNHRNTRVRPFKSNFRGGRCQPNYKSGLVQ 

KSLYIQAKYQRLRFTGPRGFITHKFRERLMRK 
KKVP 


559 


1909 


A 


4235 


1 


323 


KFSIPFFLRWSFTLWPRLEGNDMISVHCNLGL 
LGLSHSPASASQVGGITGTQHHTGLIFGFLIET 
EFHHVGQAGLELLTSGDPPALAFQSAGITGVS 
HHAWLQVLNS 


560 


1910 


A 


4246 


2 


1569 


TLSLLERVLMKDIVTPVPQEEVKTVIRKCLEQ 

AAL VN YSRL SE YAKIEGKKREMYELP VFCLA 

SQVMDLTIQNQKDAENVGRLITPAKKLEDTIR 

LAELVIEVLQQNEEHHAEAFAWWSDLMVEH 

AETFL SLFA VDMDAALEVQPPDT WDSFPLFQ 

LLNNDFLRTGLLICGNGKXFHKHLQDLFAPLVV 

R/YMWDLDGSSPIAQSIHRGLLSRESWEPVNN 

GSGTSEDLFWKLDALQTFIRDLHWPEEEFGK 

HLEQRLKLMASDMIESCVKRTR\IAFEVKLQK 

TSSIQQIFRVPQFNMAPCFNVMGLMAKGSIQP 

1/ T \PC\/tPN/(r»nPPi V\,1\X7T4r > kVT-rCV r TPilTT TrCT\/ 
iVL« \\_.OiYlHiYlO V^d* /UvJYl W I noNJlJtljltEi 1 V 

KEMITLLVAXFVTILEGVLAKLSRYDEGTLFS 

SFLSFTVKAASKYVDVPKPGMDVADAYVTF 

VRH SQDVLRDKVNEEMY1ERLFDQ WYN SSM 

NVICTWLTDRMDLQLHIYQLKTLIRMVKKTY 

RDFRLQGVLDSTLNSKTYETIRNRLTVEEATA 

SVSEGGGLQGISMKDSDEEDEEDD 


561 


1911 


A 


4257 


1300 


654 


SELVQFLLIKDQKKIPIKRADILKHVIGDYKDI 
FPDLFKRAAERLQYVFGYKLVELEPKSNTY1L 
INTLEPVEEDAEMRGDQGTPTTGLLMIVLGLI 
FMKGNTIKETEAWDFLLALNGVYPTKKHLIFG 
DPKKL1TEDFVRQRYLEYRR1PHTDPVDYEFQ 
WGPRTNLETSKMK VLKF VAK VHNQDPKD W 
PAQYCEALADEENRARPQPSGPAPSS 


562 


1912 


A 


4260 


1 


1498 


MVTWLYRFLPTSNMAAKLRSLLPPDLRLQF 

WLHARLQKCFLSRGCGSYCAGAKASPLPGK 

MAMGLMCGRRELLRLLQSGRRVHS V AGP SQ 

WLGKPLTTRLLFPAAPCCCRPHYLFLAASGPR 

SLSTSAISFAEVQVQAPPWAATPSPTAVPEV 

ASGETADVVQTAAEQSFAELGLGSYTPVGLI 

QNLLEFMH VDLGLPW WG AI AACTVF ARCLIF 

PL1VTGQREAARIHNHLPEIQKFSSRIREAKLA 

GDHIEYYKASSEMALYQKKHGIKLYKPLILPV 

TQAPIFISFFIALREMANLPVPSLQTGGLWWF 

QDLTVSDPIYILPLAVTATMWAVLELGAETG 

VQSSDLQWMRNVIRN1MPLITLPITMHFPTAV 

FMYWLSSNLFSLVQVSCLRIPAVRTVLKIPQR 

WHDLDKLPPREGFLESFKKGWKNAEMTRQ 

LREREQRMRNQLELAARGPLRQTFTHNPLLQ 

PGKDNPPNIPSS\SSSSSKPKSKYPWHDTLG 


563 


1913 


A 


4265 


623 


116 


MGGLAPTQTLEPTVREYQNTQLSVSYLLPEQN 

THGTRRTLSSGPSNNLPLPLSSSATMPSMQCK 

HRSPNGGLFRQSPVK/TPPIPMSFQPVPGGVNL 

PRGSGNPPHGTSILTAPPALLPHPPTHPTQQSF 

LIQENNNTNHTH SHTHT YTET LS FFL YIC VNN 

DRMEWGKSVF 


564 


1914 


A 


4270 


3 


368 


ILKRKLSSLNSEVSTIQNTRMLAFKATAQLFIL 
GCTW CLGLLQVGPAAQVMAYLFTIINSLQGF 
FIFLVYCLLS\QQVQKQYQKWFREIVKSKSES 
ETYTLS SKMGPDSKPSEGDVFPRTSE 


565 


1915 


A 


4288 


83 


406 


RNSRPLWCSPPASQPRQAPVSQSCCCPLPSSSS 
PPSALLAPTKPRALGTLRLYECSPELCTTMLP 
PAWLLMLCQAPRPQDPDPRLTQPEKSLQEAP 
GQTGASRTPRT 


566 


1916 


A 


4298 


1041 


229 


LNSSQKLACLIGVEGGHSLDSSLSVLRSFYVL 

GVRYLTLTFTCSTPWAESSTKFRHHMYTNVS 

GLTSFGEKWEELNRLGMMIDLSYASDTLIRR 

VLEVSQAPVIFSHSAARAVCDNLLNVPDDILQ 

LLKKNGGIVMVTLSMGVLQCNLLANVSTVA 

DHFDHIRA VIG SEFIG IGGN YDGTGRFPQGL\E 

DVSTYPVLIEELLSRSWSEEELQGVLRGNLLR 

VFRQVEKVREESRAQSPVEAEFPYGQLSTSCH 

FHLGASEWTPRLLIWR 


567 


1917 


A 


4299 


1 


1106 


GATPLGSVGGRTGKMDAATLTYDTLRFAEFE 

DFPETSEPVWILGRKYSIFTEKDEILSDVASRL 

WFTYRKNFPAIGGTGPTSDTGWGCMLRCGQ 

MIFAQALVCRHLGRDWRWTQRKRQPDSYFS 

VLNAFIDRKDSYYSIHQIAQMGVGEGKSIGQ 

WYGPNTVAQVLKKLAVFDTWSSLAVH1AMD 

NTVVMJEE1RRLCRI bVrCAGAl ArrADoDKii 

CNGFPAGAEVTNRPSPWRPLVLLIPLRLGLTD 

INEAYVETL1CHCFMMPQSLGVIGGKFNSAHY 

FIGYVGEELIYLDPHTTQPAVEPTDGCFIPDES 

FHCQHPPCRMSIAELDPSIAVVRGGHLSTQAF 

GAECCLGMTRKTFGFLRFFFSMLG


568 


1918 


A 


4300 


2012 


1843 


SRKFLTITPIVLYFLTSFYTKYDQIHFVLNTVS
LMSVLIPKLPQLHGVR1FGINKY 


569 


1919 


A 


4302 


186 


531 


WTFCLFL/WWVPESARWLLTQGHVKEAHRY
LLHCARLNGRPVCEDSFSQEVRVNVCVSMHt
CVWWGVGCVKCLPPRAHHIWQEKPLGPHRT 
VTESKLEAEGKTKEKAREKERKKKS 


570 


1920 


A 


4308 


3 


869 



RSGQGKVYGLIGRRRFQQMDVLEGLNLLITIS 

O K R N KLK V Y Y L 3 W JL.K NJvlLJnJN JJr h. V c.ls_N.l^O 

WTTVGDMEGCGHYRWKYERIKFLVIALKSS 

VEVYAWAPKPYHKJFMAFKSFADLPHRPLLV 

DLTVEEGQRLKVIYGSSAGFHAVDVDSGNSY 

DIYIPVHIQSQITPHAIIFLPNTDGMEMLLCYE 

DEGVYVNTYGRI1KDVVLQWGEMPTSVAYIC 

bNQIMOWObKAJUblKbviii uHLUuvrMHKKA 

QRLKFLCERNDKVFFASVRSGGSSQVYFMTL 

NRNCIMNW 


571 


1921 


A 


4309 


9 


524 


ASREMDVTKVCGEMRYQLNKTNMEKDEAE 

KJEHREFRAKTNRDLEIKDQEIEKLRIELDESK 

QHLEQEQQKAALAREECLRLTELLGESEHQL 

HLTRQEKDSIQQSFSKEAKAQALQAQQREQE 

LTQKIQQMEAQHDKTENEQYLLLTSQNTFLT 

KLKEECCTLAKKLEQISQ 


572 


1922 


A 


4318 


1 


1119 


GATPLGSVGGRTGKMDAATLTYDTLRFAEFE 

DFPETSEPVWILGRKYSIFTEKDEILSDVASRL 

WFTYRKNFPAIGGTGPTSDTGWGCMLRCGQ 

MIFAQALVCRHLGRDWRWTQRKRQPDSYFS 

VLNAFIDRKDSYYSIHQIAQMGVGEGKSIGQ 

WYGPNTVAQVLKKLAVFDTWSSLAVHIAMD 

NTWMEEIRRLCRTSVPCAGATAFPADSDRH 

CNGFPAGAEVTNRPSPWRPLVLLIPLRLGUT 

DINEAYV\ETL\KHCFHGWPQFPG/WHREGK 

PNSAHYFIGYVGEELIYLDPHTTQPAVEPTDG 

CFIPDESFHCQHPPCRMSIAELDPSIAWRGGH 

LSTQAFGAECCLGMTRKTFGFLRFFFSMLO 


573 


1923 


A 


4333 



363 


1066 


GGVPVGLASKPFQILYGHTNEVLSVGISTELD 
MAVSGSRDGTVIIHTIQKGQYMRTLRPPCESS 
LFLTIPNLAISWEGHIWYSSTEEKTTLK\ERM 
HYICFSINGKYLGSQILKEQVSDICIIGEHIVTG 
SIQGFLSIRDLHSLNLSINPLAMRLPIHCVCVT 
KEYSHILVGLEDGKLIVVGVGKPAEVKPSISN 
FISHAVGDYFGSPSFQLIEKSPLGINKLKAKFD 
FSKGSK 


574 


1924 


A 


4346 


359 


1234 


\fl}TLEEVTWANGSTALPPPLAIWSVPHRLLL 

LLYEDIGTSRVRYWDLLLLIPNVLFLIFLLWK 

LPSARAKJRITSSPIFITFYILVFVVALVGIARA 

WSMTVSTSNAATVADK1LWEITRFFLLAIEL

SVIILGLAFGHLESKSSIKRVLAITTVLSLAYSV 

TQGTLEILYPDAHLSAEDFNIYGHGGRQFWL 

t ;on/-irrn irx/Pl i rr / Tl T>T/TOI VCD ¥ Of TJCDTJCTTV 

VSSCFFFLVYSLVVILPKI rLKbKJbLraXKar Y 
VYAGILALLNLLQGLGSVLLCFDIIEGLCCVD 
ATTFLYFSFFAPLIYVAFLRGFFGSEPKILF 


575 


1925 


A 


4360 


2038 


1512 


GCWWRHPWLASQRDCLDCRIQLAEKFVKAV 
SKPSRPDMNPIRVKEVYRLEEMEKIFVRLEM 

IfTIKTi^^TPVT <lVTnRnr»RWFVPMftl YFVRT 

JVIIJVUOOVJ 1 I IVJUo I 1 \Jl\lJLsi\Tir V I IViVJJ^ I 1 V IV. 1 

VNEPWTMGFSKSFKKKFFYNKKTKDSTFDLP 
ADSIAPFHICYYGRLFWEWGDGIRVHDSQKP 
QDQDKLSKEDVLSFIQMHRA


576 


1926 


A 


4365 


69 


500 


QVEGRQGREVKRTAWRISPVWRPARCRRRST 

PQP/PE/PGAQQQERHRQGEAPMQALDPRAEP 

GPQAQSHAACQPEPEPPRVLLDPTAARGGVQ 

GRP/GLSRHPGLAPHPQTHTPWPQSGRLPCAS 

EPLPLGGIRPTPGLEPKGRDLM


577


1927 


A 


4366 


785 


502 


SAPPKKKNGVLFLSPRLKSSGAIWVHSTPTLW 
ASSNSRASTPKVAGITGARPHARIIFVFLIEMG
FHNVGQAGL/DTLTLVICPPQPPKLLGLQM 


578 


1928 


A 


4367 


1 


221 


FFFFLKKSRCVTQAGVQGVPISLHPPPPGFKRF 

SRLSLLSSWDYRHP/HAANFCIFSRDGVVSPYW 

SGWSRTPDLR 


579 


1929 


A 


4383 


1 


224 


FETESHSVTQAGMQWHNLGSLQPMP/PGLKR 
FSCLRLQSSWDHRHAPPHLAHFCIFSRDGVSP 
CWPGWSSTPDLK 


580 


1930 


A 


4397 


410 


94 


SRLKPYSTNVTAKKLPATNIPNLDCFTAKLYQ 
WFKKGIMHILHELFQNKEEGAFPNS/FYEASFT 
LRPKSDRDIAKEESYSTISLLSTDTKILMSKYK 
QLKSSDL 


581 


1931 


A 


4414 


670 


3 


VLVHRQCGGILRLRRKEAVSVLDSADIEVTDS 

RLPHATIVDHRPQHRWLETCNAPPQLIQGKA 

RSAPKPSQASGHFSVELVRGYAGFGLTLGGG 

RDVAGDTPLAVRGLLKDGFVAQRCGRLEVGD 

LVLHINGESTQGLTVHAQAVERIRAGGPQLHL

VIRRPLETHPGKPRGVGEPRKGVVPSWPDRSP 

DPGGPEVTGSRSSSTSLVQHPPSRTTLKKTRG 

SPE 


582 


1932 


A 


4424 


194 


449 


VLYIRKKKRLEKLRHQLMPMYNFDPTEEQDE 

LEQELLEHGRDAASVQAATSVQAMQGKTTL 

PS\QGPLQRPSRLVFT\DVANAIHV 


583 


1933 


A 


4435 


1 


166 


APGPPVPPPGSPPEQMPGPCPASMPP/DPPPGS 
PPEQMPGPCPVSAPP/GPPPGSPPEQMPGPCPV 
SAPPALLQDTSV 


584 


1934 


A 


4439 


1 


628 


SATPQQPSAPQHQGTLNQPPVPGMDESMSYQ 

APPQQLPSAQPPQPSNPPHGAHTLNSGPQPGT 

APATQHSQAGPATGQAYGPHTYTEPAKPKK

GQQLWNRMKPAPGTVEVSSSTSRSDPLLLPPR 

ALAPTQRASTVVLAPSPT/SEKVQNHSGSSAR 

GNLSGKPDDWP/LGHERVCGALLHRL*VGGG 

QGFHGKAAQGGAAGAAAGRLGLYH 


585 


1935 


A 


4463 


10 


144 


HKPVTNSRDTQEVPLEKAKQVLKIIATFKHTT 
SIFDDFAHYEKRQ 


586 


1936 


A 


4464 


1309 


103 


LNAESYVSFTTKLDIPTAAKYEYGVPLQTSDS 

FLRFPSSLTSSLCTDNNPAAFLVNQAVKCTRK 

INLEQCEEIEALSMAFYSSPEILRVTDSRKKVPI 

TVQSrVIQSLNKTLTRREDTDVLQPTLVNAGH 

FSLCVNVVLEVKYSLTYTDAGEVTKADLSFV 

LGTVSSVVVPLQQKFEIHFLQENTQPVPLSGN

PGYWGLPLAAGFQPHKGSGIIQTTNRYGQLT 

ILHSTTEQDCLALEGVRTPVLFGYTMQSGCK 

LRLTGALPCQLVAQKVKSLLWGQGFPDYVA 
PFGNSQGP/ADMLDWVPIHFITQSFNRKDSCQ 
LPGALV1EVKWTKYGSLLNPQAKIVNVTANLI 
SSSFPEANSGNERTILISTAVTFVDVSAPAEAG
FRAPPAINARLPFNFFFPFV


587 


1937 


A 


4471 


614 


387 


LLGRASAC/LQLQSSW/D/HRPMLPYLANFVF 

CKDR/SFTWLPRLVLNSWLQVILLPWPPTGCD 

NKHEPPCPATKRRHSGSI 


588 


1938 


A 


4480 


1720 


1458 


HDLGSLQPPPPGFKRFSCLSLPSSWDYRLMPP 
CPANFCim/DFLVETGFHHVGQASHELLTSGD 
PPTSASQSAGITGMSYHTWFGES 


589 


1939 


A 


4487 


922 


332 


APVTTSPRVGQPW/RTALALRSLYRARPSLRC 
PPVELPWAPRRGHRLSPADDELYQRTRISLLQ
REAAQAMYIDSYNSRGFMINGNRVLGPCALL
PHSVVQWNVGSHQDITEDSFSLFWLLEPRIEI
VWGTGDRTERLQSQVLQAMRQRGIAVEVQ
DTPNACATFNFLCHEGRVTGAALIPPPGGTSL
TSLGQAAQ 


590 


1940 


A 


4492 


1 


472 


FFFFETESRSVAQAGVQWRDLGSLQAPPPGFT 

PFSCLSLPSSWDYRRPPLRPANFFVFLVETGFP 

RFSRDGLDLLT/S/GDPPTSASQSAGITGVSHR 

ARPKR1GEPRRKCGNAVVWPSTSLGDHRVTS 

VPHQGGLPGPIRVAPSSAGQREASQGPPGR 


591 


1941 


A 


4495 


1444 


1116 


IAARFTLAKT^QLKRPVmiDSIKKTRWIYT 
MEYYADTERNEIMSFVAGTWVELEAIILSKLM
LKDNWVEDTIPQGAVPCTATAEGMKRLLFAL 
EPWDSSCFPHPSSGV 


592 


1942 


A 


4496 


2 


919 


RTRPLFSGRPTRPVCTMSDERRLPGSAVGWL 

VCGGLSLLANAWGILSVGAKQKKWKPLEFL

LCTLAATHMLNVAVPIATYSVVQLRRQRPDF 

EWNEGLCKVFVSTFYTLTLATCFSVTSLSYHR 

MWMVCWPVNYlU J SNAI<aCQAGHTVMGIWM 

GSFILSALPAVGWHDTSERFYTHGCRFIVAEI 

GLGFGVCFLLLVGGSVAMGVICTAIALFQTL 

AVQVGRQADHRAFTVPTIVVEDAQGKRRSSI 

DGSEPAKTSLQTTGLVTTIVFIYDCLMGFPVL 

GPFSLADTHLSDLPYTWGDRDSGGACVM 


593 


1943 


A 


4506 


2 


193 


FFFEAESCSVPQAGVQRPDLGWLHAPPPVGSC 
HFPASASQVAGTTHARHHTQLIF\AFLVENGL 
C 


594 


1944 


A 


4507 


1327 


647 


KMAGGVRPLRGLRALCRVLLFLSQFCILSGG 

ESTEIPPYVMKLCPSNGLCSRLPADCIDCTTNFS 

CTYGKPVTFDCAVKPSVTCVDQDFKSQKNFII 

NMTCRFCWQLPETDYECTNSTSCMTVSCPRQ 

RYPANCTVR\DHVHCLGNRTFPKMLYCNWT 

GGYKWVYGLWLLRHHPRWGLGADRF\YLGP 

VAGTASGKLFSFGGLGIWTLIDVLUGVGYVG 

PADGSLYI 


595 


1945 


A 


4512 


533 


264 


FFFKMESYSVARLECSGA1SAPCNLHLLGSNN 
SPASASRV/AGNIGARHHTQQIFVLLVQMRVH 
YVGQDGLDLL/NLMIHPPRSPKVLGLQA


596 


1946 


A 


4513 


3 


1674 


HASDHLYPNFLVNELILKQKQRFEEKRFKLD 

HSVSSTNGHRWQIFQDWLGTDQDNLDLANV 

NLMLELLVQKKKQLEAESHAAQLQILMEFLK 

VARRNKREQLEQIQKJELSVLEEDIKRVEEMS 

GLYSPVSEDSTVPQFEAPSPSHSSIIDSTEYSQP 

PGFSGSSQTKJCQPWYNSTI^SRJWCRLTAHFE 

DLEQCYFSTRMSRISDDSRTASQLDEFQEOLS 

ICF\TRYNSVRPL\ATLSYASDLYNGSQYKSLV 

FEFDRDCDYFAIAGVTKKIKVYEYDTVIQDA 

VDIHYPENEMTCNSKJSCISWSSYHKNLLASS 

DYEGTVILWDGFTGQRSKVYQEHEKRCWSV 

DFNLMDPKLLASGSDDAKVKLWSTNLDNSV 

ASIEAKANVCCVKFSPSSRYHI.AFGCADHCV 

HYYDLRNTKQPIMVFKGHRKAVSYAKFVSG 

EEI V b AS lTJbQLKL WN VG i CLRbr K.G H1N 

EKNFV\GLASNGDYIACGSENNSLYLYYKGLS 

KTLLTFKTOTVKSVLDKDRKJEDDTNEFVSAV 

CWRALPDGESNVLIAANS\QGTI\KVLELV 


597 


1947 


A 


4518 


536 


824 


RSLALSPGLECSGMISAHCNLHLLGSSDPPTS 

ASQVAEITSVRHHTWLIFCIVLGQMGFHHVGE 

QAGLELLTSWDPAILPSQSAGIIGMSPHAWPP 


598 


1948 


A 


4524 


1 


384 


FDTEFVNIGGDFDAAAGVFR\CRLPGAYFFSF
TLGKLPRKTLSVKLMKKRDEVQAMIYDDGSS
RRREMQSQSVMLALRRGDAVWLLSHDHDG
YGAYSNHGKYITFSGFLVYPDLAPAAPPGLG 
ASELL 


599 


1949 


A 


4526 


366 


776 


MGQPAPYAEGPIQGGDAGELCKCDFLVFTSP 

NPEAVCEAGTPAMFQTAWRQMESCSI/AQAG 

VQWRDPGSLHPPPLGFKRFSCLSLPSSWDYK 

HAPPHPANFCIFSRDQVSPCWPGWSRSLDLVI 

PPPWLPKVLGLQA 


600 


1950 


A 


4529 


776 


334 


FFFETESCYVAQAGVQWCDLCSLQAPPPG\SS
DPPASASRVAGTTGARHHTQLIFVFLVETGFH 
\MLARDGLKLLTSSDPPASASQSSWDYRREPP 
RLANFFVFLVETGSRYVAQAGVQWLFTGAIP 
LLISTGVLTCSVSDLGRFTPP 


601 


1951 


A 


4533



1460 


403 


HEVQESfflFLESEFSRGISDNYTLALITYALSS 

VGSPKAKEALNMLTWRAEQEGGMQFWVSSE 

SKLSDSWQPRSLDIEVAAYALLSHFLQFQTSE 

GIPIMRWLSRQRNSLGGFASTQDTTVALKALS 

EFAALMNTERTNIQVTVTGPSSPSPVKFLIDT

HNRLLLQTAELADGTANGSV/SISANGFGFAI 

CQLNWYNVKASGSSRRRRSIQNQEAFDLDV 

AVKENKDDLNHVDLNVCTSFSGPGRSGMAL

MEVNLLSGFMVPSEAISLSETVKKVEYDHGK

LNLYLDSVNETQFCVNIPAVRNFKVSNTQDA 

SVSIVDYYEPRRQAVRSYNSEVKLSSCDLCSD 

VQRLPSL 


602 


1952 


A 


4540 


1963 


295 


MRAPGRPALRPLPLPPLLLLLLSSPWGRAVPC 

VSGGLPKPANITFLSINMKNVLQWTPPEGLQG 

VKVTYTVQYFIYGQKKWLNKSECRNINRTYC 

DLSAETSDYEHQYYAKVKAIWGTKCSKWAE 

SGRFYPFLETQIGPPEVALTTDEKSISWLTAP 

EKWKRNPEDLPVSMQQIYSNLKYNVSVLNT

KSNRTWSQCVTNHTLVLTWXLEPNTLYCVHV 

ESFVPGPPRRAQPSEKQCARTLKDQSSEFKAK 

IIFWYVLPISITVFLFSVMGYSIYRYIHVG\KEK 

HFVANLILIYG\NEFDKRFFVPA\EKIV\TNFATL 

NIS\DDSKISHQDMSLLGKSSDVSSLNDPQPSG 

NLRPPQEEEEVKHLGYASHLMEIFCDSEENT\ 

EGTSFTQQESLSRTJPPDKTVIEYEYDVRTTDI 

CAGPEEQELSLQEEVSTQGTLLESQAALAVL

GPQTLQYSYTPQLQDLDPLAQEHTDSEEGPEE 

EPSTTLVDWDPQTGRLCIPSLSSFDQDSEGCE 

PSEGDGLGEEGLLSRLYEEPAPDRPPGENETY 

LMQFMEEWGLYYQMEN 


603 


1953


A 


4543 


3 


600 


YSAVEFVEQASGISDWWNPALRKRMLSDSGL 

GMIAPYYEDSDLKDLSHSRVLQSPVSSEDHAI 

LQAVIAGDLMKLIESYKNGGSLLIQGPDHCSL 

LHYAAETGNGEIVKYILDHGPSELLDMADSE 

TGETALHKAACQRNRAVCQLLVDAGASLRK\ 

TDSKGKTPQERAQQAXGDPDLAA/YTIESRQN 

YKVIGHEDLETAV


604



1954 


A 


4548 


3 


938 


QDNKVQNGSLHQKDTVHDNDFEPYLTGQAN 

QSNSYPSMSDPYLSSYYPPSIGFPYSLNEAPW 

STAGDPPIPYLTTYGQLSNGDHHFMHDAVFG 

QPGGLGNNIYQHRFNFFPENPAFSAWGTSGS 

QGQQTQSSAYGSSYTYPPSSLGGTWDGQPG 

FHSDTLSKAPGMNSLEQGMVGLKIGDVSSSA 

VKTVGSWSSVALTGVLSGNGGTNVNMPVS 

KPTSWAAIASKPAKPQPKMKTKSGPVMGGG 

LPPPPIKHNMDIGTWDNKGPVPKAPVPQQAP

SPQAAPQPQQVAQPLPAQPPALAQPQYQSPQ
QPPQ 


605 


1955 


A 


4553 


2 


2304 


ILLQEKRNCLLMQLEEATRLTSYLQSQLKSLC 

ASTLTVSSGSSRGSLASSRGSLASSRGSLSSVS 

FTDIYGLPQYEKPDAEGSQLLRFDLIPFDSLGR 

DAPFSEPPGPSGFHKQRRSLDTPQSLASLSSRS 

SLSSLSPPSSPLDTPFLPASRDSPLAQLADSCE 

GPGLGALDRLRAHASAMGDEDLPGMAALQP 

HGVPGDGEGPHERGPPPASAPVGGTVTLRED 

SAKRLERRARRISACLSDYSLASDSGVFEPLT 

KRNEDAEEPAYGDTASNGDPQIHVGLLRDSG

SECLLVHVLQLKNPAGLAVKEDCKVHIRVYL 

PPLDSGTPNTYCSKALEFQVPLVFNEVFRIPV 

HSSALTLKSLQLYVCSVTPQLQEELLG1AQIN 

LADYDSLSEMQLRWHSVQVFTS\LNHQGRGR 

LGVQERAPPGTLHTPSPSPA/STDAVTVLLAR 

TTAQLQAVERELAEERAKLEYTEEEVLEMER

KEEQAEAISERSWQADSVDSGCSNCTQTSPPY 

PFPPPMflTn^TT fiHPF A AOAfiPY^PFlf FfYPQPl 

KVDKETNTEDLFLEEAASLVKERPSRRARGSP

FVRSGTIVRSQTFSPGARSQYVCRLYRSDSDS 

STLPRKSPFVRNTLERRTLRYKQSCRSSLAEL 

MARTSLDLELDLQASRTRQRQLNEELCALRE 

LRQRLEDAQLRGQTDLPPWVLRDERLRGLLR 

EAERQTRQTKLDYRHEQAAEKMLKKASKEI 

YQLRGQSHKEPIQVQTFREKIAFFTRPRINIPPL 

PADDV 


606 


1956 


A 


4555 


3429 


776 


PGSGPGPAPFLAPVAAPVGGISFHLQIGLSREP 

VLLLQDSSGDYSLAHVREMACS1VDQKFPEC 

GFYGMYDKILLFRHDPTSENILQLVKAASDIQ 

EGDUEVVLSASATFEDFQIRPHALFVHSYRA 

PAFCDHCGEMLWGLWRQGLKCEGCGLNYH 

KRCAFKIPNNCSGVRRRRLSNVSLTGVSTIRT 

SSAELSTSAPDEPLLQKSPSESFIGREKRSNSQ 

SYIGRPIHLDKILMSKVKVPHTFVIHSYTRPTV 

CQYCKKLLKGLFRQGLQCKDCRFNCHKRCA 

PKVPNNCLGEVTINGDLLSPGAESDVVMEEG 

SDDNDSERNSGLMDDMEEAMVQDAEMAMA 

ECQNDSGEMQDPDPDHEDANRTISPSTSNNIP 

LMRWQSVKHTKRKSSTVMKEGWMVtlYTS 

KDTLRKKHYWRLDSKCITLFQNDTGSRYYKE 

IPLSEILSLEPVKTSALIPNGANPHCFEITTANV 

VYYVGENVVNPSSPSPNNSVLTSGVGADVAR 

MWEIAIQHALMPVIPKGSSVGTGTNLHRDISV 

SISVSNCQIQENVDISTVYQIFPDEVLGSGQFGJ 

VYGGKHRKTGRDVAIKUDKLRFPTKQESQLR 

NEVAILONLHHPGVVNLECMFFTPFRVFVVM 

*U r 4 111 '-■—'A -1-1. AA. T T 1 ^ J^fl^ Vri/f V£J. 1. a A A AJl\ T A V T 1 VJ 

EKLHGDMLEMILSSEKGRLPEHITKFLITQILV 

ALRHLHFKNIVHCDLKPENVLLASADPFPQV 

KLCDFGFARUGEKSFRRSWGTPAYLAPEVL 

RNKGYNRSLDMWSVGVIIYVSLSGTFPFNED 

EDIHDQIQNAAFMYPPNPWKEISHEAIDLINN 

LLQVKMRKRYSVDKTLSHPWLQDYQTWLDL 

RELECKIGERYITHESDDLRWEKYAGEQGLQ

YPTHLINPSASHSDTPETEETEMKALGERVSIL 


607 


1957 


A 


4563 


1 


4499 


SRPWWLRASERPSAPSAMAKRSRGPGRRCLL 

ALVLFCAWGTLAWAQKPGAGCPSRCLCFRT 

TVRCMHLLLEAVPAVAPQTSILDLRFNRIREI 

QPGAFRRLRNLNTLLLNNNQIKRJPSGAFEDL 

ENLKYLYLYKNEIQSIDRQAFKGLASLEQLYL

HFNQIETLDPDSFQHLPKLERLFLHNNR1THL

VPGTFNHLESMKRLRLDSNTLHCDCEILWLA 

DLLKTYAESGNAQAAA1CEYPRRIQGRSVATI 

TPEELNCERPRITSEPQDADVTSGNTVYFTCR 

AEGNPKPEIIWLRNNNELSMKTDSRLNLLDD 

GTLMIQNTQETDQGIYQCMAKNVAGEVKTQ 

EVTLRYFGSPARPTFVIQPQNTEVLVGESVTL 

ECSATGHPPPRISWTRGDRTPLPVDPRVNITPS 

GGLYIQNVVQGDSGEYACSATNNIDSVHATA 

FIIVQALPQFTVTPQDRVVIEGQTVDFQCEAK 

GNPPPVIAWTKGGSQLSVDRRHLVLSSGTLRI 

SGVALHDQGQYECQAVNIIGSQKWAHLTVQ 

PRVTPVFASIPSDTTVEVGANVQLPCSSQGEP 

EPAITWNKDGVQVTESGKFHISPEGFLTINDV 

GPADAGRYECVARNTIGSASVSlvrVLSVNVPD 

VSRNGDPFVATSIVEAIATVDRAINSTRTHLF 

DSRPRSPNDLLALFRYPRDPYTVEQARAGEIF 

ERTLQLIQEHVQHGLMVDLNGTSYHYNDLVS 

PQYLNLIANLSGCTAHRRVNNCSDMCFHQKY 

RTHDGTCNNLQHPMWGASLTAFERLLKSVY 

ENGFNTPRGINPHRLYNGHALPMPRLVSTTLI 

GTETVTPDEQFTHMLMQWGQFLDIIDLDSTV 

VALSQARFSDGQHCSNVCSNDPPCFSVMIPPN 

DSRARSGARCMFFVRSSPVCGSGMTSLLMNS 

VYPREQINQLTSYIDASNVYGSTEHEARSIRD 

LASHRGLLRQGIVQRSGKPLLPFATGPPTECM 

RDENESPIPCFLAGDHRANEQLGLTSMHTLW 

FREHNRIATELLKXNPHWDGDTIYYETRKIVG 

AEIQHITYQHWLPKILGEVGMRTLGEYHGYD

PGINAGIFNAFATAAAFRFGHTLVNPLLLPGLD 

ENFQPIAQDHLPLHKAFFSPFRIVNEGGIDPLL 

RGLFGVAGKMRVPSQLLNTELTERLFS1VIAHT 

VALDLAAINIQRGRJDHGIPPYHDYRVYCNLS 

AAHTFEDLKNEIKNPEIREKLKRLYGSTLNID 

LFPALWEDLVPGSRLGPTLMCLLSTQFKRLR 

DGDRLWYENPGVFSPAQLTQIKQTSLARILCD 

NADNITRVQSDVFRVAEFPHGYGSCDEIPRVD

LRVWQDCCEDCRTRGQFNAFSYHFRGRRSLE 

FSYQEDKPTKKTRPRKIPSVGRQGEHLSNSTS 

AVFSTRSDASG\TNDFQRVCSWEMQKTITDLR 

TQIKKLESRVLSTTECYDAGGESHANNTKWK 

KDACTICECKDGQVTCFVEACPPATCAVPVNI 

PGACCPVCLQKRAEEKP 


608 


1958 


A 


4566 


354 


1135 


FSFLC/GVSGRLGLDSEEDYYTPQKVDVPKAL 

IIVAVQCGCDGTFLLTQSGKVLACGLNEFNKL 

GLNQCMSGIINHEAYHEVPYTTSFTLAKQLSF 

YKIRTIAPGKTHTAAIDERGRLLTFGCNKCGQ 

LGVGNYKKRLGINLLGGPLGGKQVIRVSCGD 

EFTIAATDDNHIFAWGNGGNGRLAMTPTERP 

HGSDICTSWPRPIFGSLHHVPDLSCRGWHTILI 

VEKVLNSKT1RSNSSGLSIGTVFQSSSPGGGGE 

GGPDAW 


609 


1959 


A 


4567 


1 


412 


FFFFETESRSVAQAGVQWRDLGSLQAPPPGFT 

PFSCLSLPSSWDYRRPPLRPANFFVFLVETGF 

HRFSRDGLDLLT/S/GDPPASASQSAGITGVSH 

RARPRINLRNVIYSFAVTYCLNYISLAMSSTL 

KLSFHVLSGS 


610 


1960 


A 


4570 


697 


467 


ECRGVISAH\CCTLCLPSSSDSASAF\RVARTT 
GTCDYAQLIFAFLVEMGFHHVGQDGLHLL/N 
LVIRPPRPPKVLGLQA


611


1961 


A 


4571 


25 


1396 


ADPHTTVIRFFPAASATKRVLPPVLRVSSPRT 
WNPNVPESPRIPAPRLPKRMSGAPTAGAALM 
LCAATAVLLSAQGGPVQSKSPRFASWDEMN 
VLAHGLLQLGQGVCANTVGAHPQSAERAGAYR 
LSACGSACQGTEGSTDLPLAPESRVDPEVLHS 
LQTQLKAQNSRIQQLFHKVAQQQRHLEKQHL 
RIQHLQSQFGLLDHKHLDHEVAKPARRKRLP 
EMAQPVDPAHNVSRLHRLPRDCQELFQVGER 
QSGLFEIQPQGSPPFLVNCKMTSDGGWTVIQR 
RHDGSVDFNRPWEAYKAGFGDPHGEFWLGL 
EKVHS1TGDRNSRLAVQLRDWDGNAEIXQFS 
VHLGGED1AYSLQLTAPVAGQLGATTVPPSG 
LSVPFSTWDQDHDLRRDKNCAKSLSGGWWF
GTCSHSNLNGQYFRS1PQQRQKLKKG1FWKT
WRGRYYPLQATTMLIQPMAAEAAS 


612 


1962 


A 


4575 


162 


3 


FFFETESRSVAQAGVQWRDLSSLQPPPPGXSR 
GSPASASPVAGITGTRHHRTRG 


613 


1963 


A 


4584 


687 


321 


PLAQRRPFLWVTVKTNGHIWGSSTYPHFWGS 
SNS/PASASQVAGIPNARHQARHFVFLVEPRF 
HHVGRAGLGFL/NLA1CLPQHPKVLGLQACN 
LNIKPHPAHKY1SMIQFNVHFMCMSVHIYI 


614 


1964 


A 


4589 


727 


299 


PGSAQSAQRGRGRRRARAGSATQITMYSFMG 
GGLFCAWVGTILLVVAMATDHWMQYRLSGS 
FAHQGLWRYCLGNKCYLQTDSIAYWNATRA 
FM1LSALCAISGIIMGIMAF/GWVAVLMTFFA
GIFYMCAYRVHECRRLSTPR 


615 


1965 


A 


4590 


2 


414 


TILPEKIQAWAQKQCPQSGEEAVALVVHLEK 

ETGRLRQQVSSPVHKEKHSPLGAAWEVADFQ 

PEQVETQPRAVSREEPGSLHSGHQEQLNRKR 

ERRPLPKNARPSPWVTALADEWNTLHQEVTT 

TRLPAGSQEPVKD 


616 


1966 


A 


4592 


773 


488 


DFALVAQAGVQWHNLGSPQPLPPGFKRFSCL 
SLPSSWEYRCVPP/RLANFVFLVEMGFLHVGQ 
AGLELPTSGDPPALASQSAGITGVTTVPSGPG 


617 


1967 


B 


4595 


84 


478 


XRHGLREPLLERRCAAASSFQHSSSLGRELPY 

DPVDTEGFGEGGDMQERFLFPEYILDPEPQPT 

REKQLQELQQQQEEEERQRQQRREERRQQNL 

RARSREHPVVGHPDPALPPSGVNCSGCGAEL 

HCQDAR* 


618 


1968 


A 


4596 


2945 


1188 


ARSRNSARGVYGMCVDTLFLCFLEDLERNDG 

SAERPYFMCSTLKKPLARRCFPAIHAYKGVL 

MVGNETTYEDGHGSRKNITDLVEGAKKANG 

VLEARQLAMK.1FED YTVS WYWIIIGL V I AMA 

MSLLS1ILLHLLAGIMGWVMIIMEASELGYR1F 

HCYMEYSRLRGEAGSDVSLVDLGFQTDFRV 

YLHLRQTWLAFMIILSILEVUILLLIFLRKRILI 

AIALIKEASRAVGYVMCSLLYPLVTFFLLCLCI 

AYWASTAVFLSTSNEAVYK1FDDSPCPFTAKT 

CNPETFPSSNESRQCPNARCQFAFYGGESGYH

RALIXLQIFNAFMFFWLANFVLALGQVTLAG 

AFASYYWALRKPDDLPAFPLFSAFGRALRYH

TGSLAFGALILAIVQ1IRVILEYLDQRLKAAEN 

KFAKCLMTCLKCCFWCLEKFIKFLNRNAYIM 

IAIYGTNFCTSARNAFFLLMRNIIRVAVLDKV 

TDFLFLLGKLLIVGSVGILAFFFFTHRIRIVQDT 

APPLNYYWVPILTVIVGSYLIAHGFFSVYGMC 

VDTLFLCFLEDLERNDGSAERPYFMSSTLKKL 

LNKTNKKAAES 


619 


1969 


A 


4601 


2 


357 


RTSVEPYILGEF/RKLSNNTKVVKTEYKATEY

GLAYGHFSYEFSNHRDVWDLQGWVTGNGK 

GLIYLTDPQIHSVDQKVFTTNFGKRGIFYFFN 

NQHVECNEICHRLSLTRPSMEKPCKS 


620 


1970 


A 


4606 


1 


2415 


MERLWGLFQRAQQLSPRSSQTVYQRVEGPR 

KGHLEEEEEIXjEEGAETUVHFCPMELRGPEP 

LGSRPRQPNLIPWAAAGRRAAPYLVLTALLIF 

TGAFLLGYVAFRGSCQACGDSVLVVSEDVN 

YEPDLDFHQGRLYWSDLQAMFLQFLGEGRL 

EDTIRQTSLRERVAGSAGMAALTQDIRAALS 

RQKLDHVWTDTHYVGLQFPDPAHPNTLHWV 

DEAGKVGEQLPLEDPDVYCPYSAIGNVTGEL

VYAHYGRPEDLQDLRARGVDPVGRLLLVRV 

GVrSFAQKVTNAQDFGAQGVLIYPEPADFSQ 

DPPKPSLSSQQAVYGHVHLGTGDPYTPGFPSF

NQTQFPPVASSGLPSIPAQPISADIASRJLLRKL 

KGPVAPQEWQGSLLGSPYHLGPGPRLRLVVN 

NHRTSTPINNIFGCIEGRSEPDHYVVIGAQRDA 

WGPGAAKSAVGTAILLELVRTFSSMVSNGFR 

PRRSLLFISWDGGDFGSVGSTEWLEGYLSVL

HLKAVVYVSLDNAVLGDDKFHAKTSPLLTSL 

IESVLKQVDSPNHSGQTLYEQVVFTN\PSWD\ 

AEVIRPLPM\DSSAY\SFTAFVGVPAVEFSFME\ 

DDQVAYPFLHTKEDTYENLHKVLQGRLPAVA 

QAVAQLAGQLLIRLSHDRLLPLDFGRYGDW 

LRFIIGNLNEFSGDLKARGLTLQWVYSARGDY 

IRAAEKLRQEIYSSEERDERLTRMYNVRIMRV 

EFYFLSQYVSPADSPFRHIFMGRGDHTLGALL 

DHLRLLRSNSSGTPGATSSTGFQ\ESRFRRQL\ 

ALUTWDACKGAANALSGDVWNIDNNF 


621 


1971 


A 


4610 


793 


334 


ISRVDDFVGSGIANVIIAVAIFSIPAFARLVRG\ 

NTLVLKQQTFIESARSIGASDMTVLLRHILPGT 

GSSIWFFTMRIGTSIISAASLSFLGLGAQPPTP 

EWGAMLNEARADMVIAPHVAVFPALAIFLTV 

LAFNLLGDGLRDALDPKIKG 


622 


1972 


A 


4614 


2 


820 


LVYVMIAIFCIASAMSLYNCLAALIHKIPYGQ

CTIACRGKNMEVRLIFLSGLCIAVAVVWAVF 

RNEDRWAWILQDILGIAFCLNLIKTLKLPNFK

SCVTLLGLLLLYDVFFVFITPFITKNGESIMVEL 

AAGPFGNNEKNDGNLVEATGQPSAPHEKLPV 

VIRVPKLIYFSVMSVCLMPVSILGFGDIIVPGL 

LIAYCRRFDVQTGSSYIYYVSVXTVAYAIGMIL 

TFWLG\LMKKGQPALLYLVPCTLITA/CQFV 

AWETVREMKKFWERVTS 


623 


1973 


A 


4619 


17 


691 


TLVSVVEFVRRADLTREDLAPSSVDSGQAGF 

GGCCESGLPNTMPSAFSVSSFPVSIPAVLTQT 

DWTEPWLMGLATFHALCVLLTCLSSRSYRLQ 

IGHFLCLVILVYCAEYINEAAAMNWRLFSKY 

QYFDSRGMFISIVFSAPLLVNAMIIVVMWVW 

KTLNVMTDLKNAQERRKEKKRRRKED*GAA 

AAWSLRPSRPPSAAPSAAVCVAWASFQLTHG 

T lOsJBPFI 


624 


1974 


A 


4622 


164 


668 


VSCYTALQSIMNQPESANDPEPLCAVCGQAtt " 

SLEENHFYSYPEEVDDDLICHICLQALLDPLD 

TPCGHTYCTLCLTNFLVEKDFCPMDRKPLVL 

QHCKKSSILVNKLLNKLLVTCPFREHCTQVL 

QRCDLEHFIFQTSQAWGTHLSQLLGRLRQED
CLSPGVHHCSEV 


625 


1975 


A 


4625 


474 


473 


CFLSPSPLLPPLLLSSSSSPSFPLPPPPTLLPSTLP 
PPLLIPSS*LSP


626 


1976 


A 


4629 


249 


3 


KLKGNECFCYHCNVCIFLMIKK*GLFLC*IYFI 
LFFET*SHSFTOLECSGTISAHCSLQLQGSSNSP 
ASASQVAGIAGTHH


627 


1977 


A 


4635 


1 


301 


FFFFETKPFFAPQAGGQGPSRGSLNPLPTGLK 
QFSGLTLSRSGNNGPRPPPRVNFGILRGNGVP 
PGGAG*PRPPDLRGPPGLAPPQGGNNGGDPP
ARAYL 


628 


1978 


A 


4648 

1 



1357 


782 


KLFSSQRLFGPHIQAINPSFLLLSFFPS*LLAMR
TVGNNAFILVFLVYRIVLLLF*HV*PAYFQPSK
NKTAKINCN+RPFLFLVCYLL*AELH1GIFIANF
YDCIPNKLNEHLWPKLLQSLIFHVDFCGFLHK
VFYICFTEFLLFLYFL*LFIIKVSCSII*CSTICVF
SYKSFAVIIFFVDNTRFFSFGF 


629 


1979 


A 


4660 


18 


999 


HHELHTLELLQNPKEVLTRSEIQDVNYSLEAV 

KVKTVCQIPLMKEMLKRFQVAVNLAEDTAH 

PKXVFSQEGRYVKNTASASSWPVFSSAWNYF 

AGWRNPQKTAFVERFQHLSCVLGKNVFTSG 

KHYWEVESRDSLEVAVGVCREDVMG1TDRS 

KMSPDVGIWAIYWSAAGYWPLIGFPGTPTQQ 

EPALHRVGVYLDRGTGNVSFYSAVDGVHLH 

TFSCSSVSRLRPFFWLSPLASLVIPPVTDRK*G 

FSSPDQNSFPVVQLRDTHPWALFCPSCLYPG 

WSIFWVSLTVPFGICPLCASQEAVPWEVGLA 

NGDGTGNFPRRFWEIFL 


630 


1980 


A 


4669 


2 


358 


FFFFFETESHSVAQAGMQWRNLGSLPAPPPGF 
TPFFCLSLLNGWDYRRPPPHLANFFVLLVETG 
FHDVGQDGLDLLTS*STPSASQSAEITGVSHC 
TRLKKIRFAKGHVEFFFESHVE 


631 


1981 


A 


4674 


953 


614 


TPIRGTDDEHEECTVQEYSAGKNTCLRPGAV 
AHTCNPCTLGGRGRWIT*GSGVQDQPGPTWQ 
NPVFLERRPRALHSSPGLTTQRILWAQGLWV 
GAGSTGCSRGPRGEGVFREG 


632 


1982 


A 


4678 


34 


314 


RSTHASGMISPSFGFMGHLLRLEFEILPSTPNP 

*LPSYQGEAAGSSLISHLQTFSPDLKGVYCTFP 

ASGLAPVPTHWTVSELSRSPVATATFC 


633 


1983 


A 


4696 


1 


1365 


RTLGMEGERRASQAPSSGLPAGGANGESPGG

GAPFPGSSGSSALLQAEVLDLDEDEDDLEVFS 

KDASLMDMNSFSPMMPTSPLSMINQIKFEDEP 

DLKDLFITVDEPESHVTTIETFITYRHTKTSRG 

EFDSSEFEVRRRYQDFLWLKGKLEEAHPTLII 

PPLPEKFIVKGMVERFNDDFIETRRKALHKFL 

NRIADHPTLTFNEDFKIFLTAQAWELSSHKKQ 

GPGLLSRMGQTVRAVASSMRGVKNRPEEFM 

EMNNFIELFSQKINLIDKISQR1YKEEREYFDE 

MKEYGPIHILWSASEEDLVDTLKDVASCIDRC 

CKATEKRMSGLSEALLPWHEYVLYSEMLM 

GVMKRRDQIQAELDSKVEVLTYKKADTDLL 

PEEIGKLEDKVECANNALKADWERWKQNM 

QNDIKLAFTDMAEENIHYYEQCLATWESFLT 

SQTNLHLEEASEDKP 


634 



1984 


A 



4708 


421 


158 


SYWVGEDYTYKFFEVILIDPFHKAIRRNPDTQ 
WISKAVYKHREMCGLTSTGRKSHGLEKDRM 
FPHAIGGSCRAA+RRRKTLQFPCYH 


635 


1985 


A 


4709 


42 


341 


YIKQPDAKERRRTVHWKKETESEASEITIPPST 
PGVPQAPGHWEDYGRGDNFYLPH*DPGGIVL 
WNIFNRMPIARKNITDGEHHEYLIEVPRLFHT 
SED 


636 


1986 


A 


4721 


2 


351 


EKPDHFFPEGTSFIHEPRRPN*GDLVHCLGGIS 
RSTTVTVA*LMQKLNLSMNDAYYIVIMKMSS
ISPNFNSMDQPLDFQRTLGLRSPCYNRVPAQK
MYFTTPSNHNAYQVDSVQST 


637 


1987 


A 


4726 


664 


253 


NTGLTCSIQRKCGETQLYRREENRLILLLQDH 

LKSESFQVLTLSPRLEFSGLISAHCNLRLPGSS 

DSSASSSRAAGITGVHHHAWLIFFFLVETGFL 

HAG*AGLELLTSGDPPASASRSAGITGVSHHA 

RPRETRFL 


638 


1988 


A 


4734 


24 


592 


GGMDSRVSGTTSNGETKPVYPVMEKKEEDG 

TLERGHWNNKMEFVLSVAGEIIGLGNVWRFP 

YLCYKNGGGAFFIPYLVFLFTCGIPVFLLETAL 

GQYTSQGGVTAWRKICPIFEGIGYASQMIVIL 

LNVYYIIVLAWALFYLFSSFTIDLPWGGCYHE 

WNTEHCMEFQKTNGSLNGTSENATSPVIEFW 


639 


1989 


A 


4743 


1040 


699 


QGLTLLPRMECSATITAHCSLELPGSIDLPTSA 
S*VARTTGTHHHPWLILVLLL*TWGSYYVAQ 
AGLELLGSSNLPAAMVSQSAQIIGHDHCAWA 
TSNHVLYTQEGLRRGKEG 


640 


1990 


A 


4771 


527 


2 


GRIDCPHPATVLAQPIFIDACSVLGAYQGAQN 

WIRRRPCXPSGCLKMNREIGPLQHSLCCPGWS 

QTPGLKAILLRQPPK*LGLQMESHSCPPAWSA 

MARSRLTATSASQVQAILLPQPPGTTDSCSPS 

PDHEQQPLSWVLPPPQKDMNPREQQVALGP 

QAAALPWAVWRNDCFPR 


641 


1991 


A 


4780 


16 


473 


RPSSQCGGIPTGWKKGLAPELSSELSSPPLPAR 

LQLAASPYFSPSWAECPQPVPAGTHATWCLA 

RVWARMTPPGPAGIPSHPLPPPPPERSVPIPSP 

FPARDSGSRQGHSTDRYKHTDAPRDAHRRVP 

QRDTDTGVHTGSGTHTHAHTPPEK 


642 


1992 


A 


4798 


1 


487 


GYSFRCDIVDYSRSPTALRMARTCWLYYFSK 

F1ELLDTIFFVLRKKNSQVTFLHVFHHTIMPW 

TWWFGVKFAAGGLGTFHALLNTAVHVVMY 

SYYGLSALGPAYQKYLWWKKYLTSLQLVQF 

VIVAIHISQFFFMEDCKYQFPVFACnMSYSFM 

FLLLFLH 


643 


1993 


A 


4799 


2 


391 


LMAFIEMHISGSLVYLKJKTKIYSYFSMLNFLL 

QEIPLSEILRISSPRDFTNISQGSNPHCFEIITDT 

MVYFVGENNGDSSHNPVLAATGVGLDVAQS 

WEKAIRQALMPVTPQASVCTSPGQGKDHSK 

Q*ASVCTSPGQGKDHSKQ 


644 


1994 


A 


4800 


488 


101 


AYPLFAVHPVHTECVAGWGRAYLLCALFFL 
LSFLGYCKAFRESNKEGAHSSTFWVLLSIFLG 
AVAMLCKEQGITVLVRAATWLGPAFSVCPFP 
SYKDIWGWPCLCGVLHAYIPLLV 


645 


1995 


A 


4805 


458 


126 


LLWTTVLCQTPARPQSTMIHLGHILFLLLLPV 
AAAQTTPGERSSLPAFYPGTSGSCSGCGSLSL 
PLLAGLVAADAVASLLIVGAVFLCARPRRSP 
AQEDGKVYINMPGRG 


646 


1996 


A 


4817 


47 


1033 


LQGDTWHLSFLSHFSRLHGGVPGRGLLEGNL 

LQPQAPGHDMTSIPFPGDRLLQVDGVILCGLT 

HKQAVQCLKGPGQVARLVLERRVPRSTQQC 

PSANDSMGDERTAVSLVTALPGRPSSCVSVT 

DGPKF*SSN*KRIANGLGFSFVQMEKESCSHL 

KSDLVRIKRLFPGHPAEENGA1AAGDIILGRE 

WEGPRKASSSRCRGSWAMQLSVQAGPSFAS 

YYPAAVEVLHLLRGAPQEVTLLLCRPPPGAL 

PELEQEWQTPELSADKEFTRATCTDSCTSPIL 

GSRGQLGGTVPPQMQGKAWGLRPESSQKAIR

EGTMGAKTERDLGPVP 


647 


1997 


A 


4854 


1044 


335 


PRVRGDWPLEKKKSNSNIHPIFSWCGSTDSFCD

IVMPTYDLTDSVLETMGRVSLDMMSVQANT 

GPPWESKNSTAVWRGRDSRKERLELVKLSRK 

HPELIDAAFI'NFFFFKHDENLYGPIVKHISFFD 

FFKHKYQINIDGTVAAYRLPYLLVGDSVVLK 

QDSIYYEHFYNELQPWKHYIPVKSNLSDLLEK 

LKWAKDHDEEAKKIAKAGQEFARNNLMGD 

DIFCYYFQTFPRNMPIYK 


648 


1998 


A 


4867 


2030 


837 


AGMLPAVGSADEEEDPAEEDCPELVPMETTQ 
SEEEEKSGLGAKIPVTHTGYLGAGKTTLLNYI 
LTEQHSKRVAVILNEFGEGSALEKSLAVSQG 
GELYEEWLELRNGCLCCSVKDNGLRAIENLM 

rUCrvHlfFnYTT T FTTfil ADPfJ AVA SMPWVnA 

ELGSDIYLDGIITIVDSKYGLKHLAEEKPDGLI 
NEATRQVALADAILINKTDLVPEEDVKKLRT 
TIRSINGLGQILETQRSRVDLSNVLDLHAFDSL 

NAJCEEHLNMFIQNLLWEKNVRNKDNHCMEV 
IRLKGLVSIKDKSQQVIVQGVHELYDLEETPV 
SWKDDTERTNRLVLLGRNLDKD1LKQLFIAT 
VTETEKQWTTHFKEDQVCT 


649 


1999 


A 






IRQ 


DO VST 1 T PKT nVOWAOYWAHWOPPI PfTFKR 
FSCLSLRSS WD* KC APPHPAFVFL VEMGFHRV 
GQAGLELRTSGDPPASASQSAGITGVSHLA*P 
TSMPLLPFQRLCVYI 


650 


2000 


A 


4874 


2 


437 


FFFLRRSFAFVAQAGVQWCDLGSPQPLPPGF
K*FSCLSLPSSWDYRHAPPPCPS*FLYF**RQG
FTMLARLVLNS*PHDLPTSPSQSAEIKGVSHR
CPASFYLFLKYYLEAKFCA*GECAPSAGVGA
GYKRGHKSCLLINCWQI 


651 


2001 


A 


4898 


1701 


771 


DAWGPETRLARILNPDSFIEPRPGRLPELEATR 

AALKLPLLVSKLLDIPGLEVAWTTERAKHFY 

SPQDIPVTLYSDADEWEMWKSRSDPVLHIDL 

RRWADLLLVAPLDANTLGKVASGICDNLLTC 

VMRAWDRSKPLLFCPAMNTAMWEHPITAQQ 

VDQLKAFGYVEIPCVAKKLVCGDEGLGAMA 

EVGTIVDKVKEVLFQHSGFQQS*PGISVMGVP 

1 YSFWVOAKSVKMDVGKTGGYPH1 T NOGPA 

LSLPRGQACSRLNWTEGPGLSFFQPGEAAA 


652 


2002 


A 


4927 


1 


611 


FRGRQTSRPARGFSPWRPPGTMQEPSSGECPA 

SP*LPCASNRLAFGGLIFPCAPLVPYPAPFSPLL 

PAFSCAPRPRAHTHSRTHPSAPLVPKPSSRAR 

GQSPIPSRASSPSCSWAQVPGVALARCAGVC 

KPGDSWRVAACISGRCCSRGRRRGSGPRNPE

QSFRGAWGPSFWGSWKSQRELSAGGAQAWP 

LLGSAGSGLRGEA 


653 


2003 


A 


4965 


2 


283 


FFFH*DGVSLCHPGWNAVARSWLTATSASR 
VOAVSCFRLPSSWDYRHATMPG*FF*YF**R 
WGFTrLAILVLNS*PQVlCPPWPPKVXTLQA 


654 


2004 


A 


4968 


3 


437 


RPGIPGRRFRRSWFCOLP*EPEPGLESLATPGD
IPAVGLGAXGVIPPVRVPQRPPTQRSQGRGW
DPERDPGCRVQVSRGPRFGEQKTPGLQGCLP
PPCLTHLAAASCWVWCGRWKRDSAECQCD 
HSCSAVSQQEDRCRSSSCS 


655 


2005 


A 


4983 


201 


397 


MNNNTTCIQPSMISSMALPIIYILLCIVGVFGN 
TLSQWIFLTKIGKKTSTHIYLSHLVTANLLVC 


656 


2006 


A 


4988 


332 


159 


LVHKDMYREFFEEEAQASNKHVTRCLTSLVI 
REVHIKTMR*HFLPIRLEKNKNNIKD 


657 


2007 


B 


5008 


129 


465 


MAGMKTASGDYIDSSWELRVFVGEEDPEAES
VTLRVTGESHIGGVLLKIVEQINRKQDWSDH

ATWWFHlirPnWl I OTT4WTI rYkTVYITT AHA t>l V 

FGPQHRPVILRLPNRRALRLX* 


658 


2008 


A 


5017


1 


292 


FFFFKETESHSVTQAGVQWHDLGSLQPPPPGF
KRFSCLSLLSSWDYRCAPPHPANFVFLVETGF 

UUVAHAnT VT T TT *QAT\IT ffi QTQT PTPT "PIT T « 


659 


2009 


A 


5018 



17 


338 


RGHGGKSLTGGTPGNWGDGLLVSEDWSHLIF 
T*NSLVSPVLGKWSPCLQGPGLSAVHTWPWL 
MAACWAVHVKTHMRPGUWLPRLVLNSWS 

* A TIT T WPDV A f C T C\ A 


660 


2010 


A 


5028 


2 


310 


SRVDDFVGERRGGCDECLCGHRGLRAVPLG 
HPGHLCLQPPGGPA*FLDYCRGCCPHPVPGST 
AGSCPRQKKTTPGPTVLCVCSFWIYQRGEPH 
HRTGARWNH 


661 


2011 


A 


5050 


752 


431 


RQSCSSTQAKVQWFHYGPLQSQPPGLKQSSQ 
LSLPNSRDHRHVPPRLAIFSFAETGSPYFAQAS 
LELLGSSHPPTSASQSARITGVSHRAWPLK*F 
NLNQYQTLTMN 


662 


2012 


A 


5054 


48 


103 


ELNNGPFQMPLCNGGNLAVTGSWADRSPLH 

EAASQGRLLALRTLLSQGYNVNAVTLDHVTP 

LHEACLGDHVACARTLLEAGANVNAITIDGV 

TPLFNACSQGSPSCAELLLEYGAQAQLESCLP 

SPTHEGASKGHHECLDILISWGIDVDQEIPHSG 

TPLWACMAQQFHCIWNLIYAGAGVRKGKY 

WDTPLPGAGHQSTQKLE*LFAMVEIWQ 


663 


2013 


A 


5066 


951 


580 


VRNS*SFAHCASVYKHHYMDGQTPCLFVSSK 
ADLPEGVAVSGPSPAEFCRKHRLPAPVPFSCA 
GPAEPSTTIFTQLATMAAFPHLVHAELHPSSF 
WLRGLLGWGAAVAAVLSFSLYRVLVKSQ 


664 


2014 


A 


5071 


550 


1 


LSFIEVLSMEQVNKTWREFWLGFSSLARLQ 

QLLFVIFLLLYLFTLGTNAIIISTIVLDRALHTP 

MYFFLAILSCSEICYTFVIVPKMLVDLLSQKK 

TISFLGCAIQMFSFLFFGSSHSFLLAAMGYDR 

YMAICNPLRYSVLMGHGVCMGLMAAAWAC 

GFTVSLVTTSLVFHLPFHSSNQHE


665 


2015 


A 


5074 


496 


692 


QQYHNTGSAGHHAHCQVGHSPHVHYPSGCG 
PL*IQRGLPSFNSLEGHSLKDSGHEESVQLDSE 
HDVQRSLYCDTAVNDVLNTSVTSMGSQMPD 
HlA^NtOr HUKJltUKtijOrlalJKC WlvLrKNrMFl 

RSKSPEHVRNIIALSIEATAADVEAYDDCGPT 
KRTFATFGKDVSDHPAEERPTLKGKRTVDVT 
ICSPKVNSVIREAGNGCEAISPVTSPLHLKSSL 
PTKPSVSYEIVDPGITARRC 


666 


2016 


A 


5080 


408 


248 


IMLLSTSS*VYFQSSTKDSHFFLFDFQKTGPPL 
VGPKAQLSGLQLQPCLYKJUl 


667


2017


A 


<f\Q\ 

.>Uai 




24/ 


UL 1 N orir r LFUryKTOPPLUuPKAQr c> S LQLQ 
PCVY*RR 


668 


2018 


A 


5086 


852 


233 


NIKSNDRWVQ0CTAYKYFF*KNGDNYNWVF 
RALPTTFADIENLKYLLFTRDASQPFYLGHTV 
lFGDLEYVTVEGGwLSRELMKRLNRLLDNSE 

TPAnO^VlWk'l QFnk' flT A in VVAnVHATTKIA 

EDYEGRDVFNTKPIAQLIEEALSNNPQQVVEG 
CCSDMAITFNGLTPQKMEVMMYGLYRLRAF 
GHYFNDTLVFLPPVGSEND 


669 


2019 


A 


5101 


1 


329 


PGRPTRPPLLTLLAHVSPEPAGPSCDSLAQPG
ASGV*VQHDSHPPLLCGSQCLSEPVPGSHGPP
RGCQHEAAPCPRGPGSDGLHHASAACASLPP
SPILPVLLPELGPL 


670 


2020 


A 


5102 


3 


547 


DAWGNRCAVGAAPRLIHLHLCCTPADPSRKP


SEQ ID NO: of nucleotide sequence


SEQ ID NO: of peptide sequence


Method


SEQ ID NO: in USSN 09/496914


Predicted beginning nucleotide location corresponding to first amino acid residue of peptide sequence


Predicted end nucleotide location corresponding to last amino acid residue of peptide sequence


Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid, E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine, I=Isoleucine, K=Lysine, L=Leucine, M=Methionine, N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine, T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop codon, /=possible nucleotide deletion, \=possible nucleotide insertion)


DEL*NMNGRVDYLV1'EEE1NLTRGPSGLGFNI 
VGGTDQQYVSNDSGIYVSRIKENGAAALDGR

GYAVSLRVQHRLQVQNGPIGHRGEGDPSGIPI 

PIVfVI VPVFAT TMVAA WAFJV/TR VT?001 
r ivi vLvrv r/iL i ivi v /va w s\r ivu\. i i\v < »* S) /l, 


671 


2021 


A 


5105 


672 


400 


RDGREELCLQQEPTLPSRJCSSAPLLYFLFICPF 
VLLLLLLISLLCLYWKARKLSTLRSNTRKEKA 
LWVDLKEAGGVTTNRMED*EEDECN 


672 


2022 


A 


5148 


72 


314 


IIYFSYNIFLKITELLNDVERLKQALNGLSQLT 
YTSGNPTKRQSQLIDTLQHQVKSLEQQLAVS 
NQAHGALQEYVLAPCS 


673 


2023 


A 


5152 


210 


335 


REILCSRIGRLNIV*MSLFPNLTCRLNAIPIK1PA 
NHFVEVT 


674 


2024 


A 


5153 


3 


2953 


LTEDQPFDILQKSLQEANITEQTLAEEAYLDA 

SIGSSQQFAQAQLHPSSSASFTQASNVSNYSG 

QTLQPIGVTHVPVGASFASNTVGVQHGFMQH 

VGISVPSQHLSNSSQISGSGQIQLIGSFGNHPS 

MMTINNLDGSQIILKGSGQQAPSNVSGGLLV 

HRQTPNGNSLFGNSSSSPVAQPVTVPFNSTNF 

QTSLPVHN1HQRGLAPNSNKVPIN1QPKPIQM 

GQQNTYNVNNLGIQQHHVQQGISFASASSPQ 

GSVVGPHMSVNIVNQQNTRKPVTSQAVSSTG 

GSIVIHSPMGQPHAPQSQFLIPTSLSVSSNSVH 

HVQTINGQLLQTQPSQLISGQVASEHVMLNR 

NSSNMLRTNQPYTGPMLNNQNTAVHLVSGQ 

TFAASGSPVIANHASPQLVGGQMPLQQASPT 

VLHLSPGQSSVSQGRPGFATMPSVTSMSGPSR 

FPA VS SASTAHPSLGSAVQSGSSGSNFTGDQL 

TQPNRTPVPVSVSHRLPVSSSKSTSTFSNTPGT 

GTQQQFFCQAQKKCLNQTSPISAPKTTDGLR 

QAQIPGLLSTTLPGQDSGSKVISASLGTAQPQ 

QEKVVGSSPGHPAVQVESHSGGQKRPAAKQ 

LTKGAFILQQLQRDQAHTVTPDKSHFRSLSD 

AVQRLLSYHVCQQSMPTEEDLRKVDNEFETV 

ATQLLKRTQAMLNKYRCLLLEDAMRINPPAE 

MVMIDRMFNQEERASLSRDKRLALVDPEGFQ 

ADFCCSFKl-DKAAHETQFGRSIX^HGSKJVSSS 

LQPPAKAQGRDRAKTGVTEPMNHDQFHLVP 

NHIWSAEGNISKKTECLGRALKFDKVGLVQ 

YQSTSEEKASRREPLKASQCSPGPEGHRKTSS 

RSDHGTESKLSSILADSHLEMTCNNSFQDKSL 

RNSPKNEVLHTDIMKGSGEPQPDLQLTKSLET 

TFKNILELKKAGRQPQSDPTVSGSVELDFPNF 

SPMASQENCLEKFIPDHSEGWETDSILEAAV 

NSILEC 


675 


2025 


A 


5154 


599 


1880 


LKKMEPFSCDTFVALPPATVDNRIIFGKNSDR 

LYDEVQEWYFPAVVHDNLGERLKCTY1EID 

QVPETYAWLSRPAWLWGAEMGANEHGVCI 

GNEAVWGREEVCDEEALLGMDLVRLGLERA 

DTAEKALNVIVDLLEKYGQGGNCTEGRMVF 

OVUXlOUI I A r\15"KTT? A \17TI TTTA PWU/ A A W \ / (~YC 

o Y Hrs or Ll AjJKJNelA W lL,n, lAvjlv i VV/V>\xLrv V l^.t, 

GVRNISNQLSITTKIAREHPDMRNYAKRKGW 

WDGKKEFDFAAAYSYLDTAKMMTSSGRYCE 

GYKLLNKHKGN1TFETMMEILRDKPSGINME 

GEFLTTASMVFILPQDSSLPCIHFFTGTPDPER 

SVFKPFIFVPHISQLLDTSSPTFELEDLVKKKS 

HFKTDRRHPLYQKHQQALEVVNNNEEKAKI 

MLDNMRKLEKELFREMESILQNKHLDVEK1V 

NLFPQCTKDEIQIYQSNLSVKVSS 






WO 01/57188 



PCT/US01/03800 





676 


2026 


A 


5155 


2 


306 


FFFLRRSLALSPRPDCGLQWRNLGSLQAPPPG 
FTPF°»CI SI PSSWDYRRPPPRPAlsJFI Vfc'**PPn 

FTLLARMV SI S*PHDPPASASQSAGITGVSHRA 
RPT 


677 


2027 


A 


5167 


97 


740 


FFHSVDLLALEQSKTFYKPDWFDIVESEVKCC 

SNDLDVP VGHI VHTGMLNEGGGYEN DCSIAR 
LNKRSFFMISPTDC^VHCWAWLKKHMPKDS 
NLLLEDVTWKYTALNLIGPRAVDVLSELSYA 
PMTPDHFPSLFCKEMSVGYANGIRVMSMTHT 

rSFPnPMI VIPTPVPVl/nFTTVrfl QTT VQXTQ 
\Jxjr\jrivlL, J lric, j i\ VV ur 1 1V1L*oI1j VoINo 


678 


2028 


A 


5183 


1919 


2018 


PALCRLRDDMTVCVADFGLSKKIYSGDYYRQ 
GPJAKMPVKWIAIESLADRVYTSKSDVWAFG 
VTMWEIATRGMTPYPGVQNHEMYDYLLHG 
HRLKQPEDCLDELCKI**SPQSP 


679 


2029 


A 








K^PPENIIDGNPETFWTTTGMFPQEFIICFHKH 
VRIERLVIQSYFVQTLKIEKSTSKEPVDFEQWI 

Cl/ni X/TJTTT/^f^T n\11?CI\7 A UT"V , 0 a T\n r> mi \ /n 

liJvJ^LVlllliU^J^^NtmVAliJUOaAl YLRrllVS 
AFDHFASVHSVSAEGTVVSNLSS 


680 


2030 


A 


5204 


541 


92 


EILAVLKLACGDISLNALALMVATAVLTLAPL 
LUCLo Y IJjoAILK Vr^aAAUKCKAr o I Co AH 

RTVWVFYGTISFMYFKPKAKDPNVDKTVAL 
FYGVVTPSLNPIIYSLRNAEVKAAVLTLLRGG 
LLSRKASHCY CCPLPLSAGIG 


681 


2031 


A 


5207 


10 


it / 


vrL/IMOU V IKJLrVCol L,VfcJil oLI VoEAMiiQSI 
KNESPLPGTLAHTCNTSTLGGRGRWIT*GREF 
DTSMANMVKPCLYRK 


682 


2032 


A 


5210 


2 


231 


T7"Cp"p'TT7 C VQTTO A f1\/0\l/T>XTT COT IfTI DDPT1/ *T? 

rrrni Jtio I oil VfAuvy WrfslLooJLJS.1 LrrurK.*r 
SCLSLPSSWDYRCLPPCPANFCIFSRNGVLPC 

WPTiWIUTPni <5 
w rvj w oiv i rULtO 


683 


2033 


A 


5218 


85 


402 


CPSVSGLIKSDLRRHNINIGITNVDVKAVSNIF 
MIILXRSMYRINVKPYFFI*LFFSRVNC*SVIIG 
YARCYTFLIF*LFL* IPADSPTDQEPKTVMLSK 
QSESAI 


684 


2034 


A 


5220 


1 


194 


IN IjlVlJvDJVl^iN L.JN o JUN FlTv 1 Wet I JvU 1 Js.* 1Mb l r 

YG+ALNVIKMAVLPKLMYRPSATLVKIPQHL 

TDS 


685 


2035 


A 


5228 


260 


440 


LHSQDGNSDPRKPQGEMSAHAFPVQTCGEED 
QKKTPQVPINFTELSKCS*S*KIMSGERE 


686 


2036 


A 


5239 


79 


SOS 


APSEAALEKKXSELSNSQQSVQTLSLWLIHHR 
KHSRPIVTVWEREUtKAKPNPJCLTFLYLAND 
VIQNSKRKGPEFTXDFAPVIVEAFKHVSSETD 
ESCKKHLGRVLSIWEERS 


687 


2037 


A 


5244 


1 


428 


MAAVVAATALKGRGARNARVLRGILAGATA 
NKASHNRTRALQSHSSPEGKEEPEPLSPELEYI 
PRKRGKNPMKAVGLAWAIGFPCGILLFILTKR 
EVDKDRVKQMKARQNMRLSNTGEYESQRFR 
A SSQSAPSPDVGSGVQT 


688 


2038 


A 


5249 


1 


1407 


LQQTEDKSLLNQGSSSEEVAGSSQKMGQPGP 

SGDSDLATALHPXSLRRQNYLSEKQFFAEEW 

QRKIQVLADQKEGVSGCVTPTESLASLCTTQS 

EITDLSSASCLRGFMPEKLQIVKPLEGSQTLY 

HWQQLAQPNLGTILDPRPGVITKGFTQLPGD 

AIYHISDLEEDEEEGITFQVQQPLEVEEKLSTS 

KPVTGIFLPPITSAGGPVTVATANPGKCLSCT 

NSTFTFTTCRILHPSDITQVTPSSGFPSLSCGSS 

GSSSSNTAWSPALAYRLS1GESITNRRDSTTT 

FSSTMSLAKLLQERG1SAKVYHSPISENPLQPL " 

PKSLAIPSTPPNSPSHSPCPSPLPFEPRVHLSEN 

FLASRPAETFLQEMYGLRPSRNPPDVGQLKM 

NLVDRLKRLGIARWKNPGAQENGRCQEAEI 

GPQKPDSAVYLNSGSSLLGGLRRNQSLPVIM 

G SFAAP VCTS SPKMG VLKED 


689 


2039 


A 


5254 


2 


2621 


LSLFGSRALGRSGARAMAKAKXVGARRKAS 

GAPAGARGGPAKANSNPFEVKVNRQKFQILG 

RKTRHDVGLPGVSRARALRKRTQTLLKEYKE 

RDKSNVFRDKRFGEYNSNMSPEEKMMKRFA 

LEQQRHHEKKS1YNLNEDEELTHYGQSLADIE 

KHNDIVDSDSDAEDRGTLSGELTAAHFGGGG 

GLLHKKTQQEGEEREKPKSRKELIEELIAKSK 

QEKRERQAQREDAL ELTEKLDQD WKEIQTLL 

SHKTPKSENRDKKEKPKPDAYDMMVRELGF 

EMKAQPSNRMKTEAELAKEEQEHLRKLEAE 

RLRRMLGKDEDENVKKPKHMSADDLNDGFV 

LDKDDRRLLSYKDGKMNVEEDVQEEQSKEA 

SDPESNEEEGDSSGGEDTEESDSPDSHLDLES 

NVESEEENEKPAKEQRQTPGKGLISGKERAG 

KATRDELPYTFAAPESYEELRSLLLGRSMEEQ 

LLVVERIQKCNHPSLAEGNKAKLEKLFGFLLE 

Y VGDLATDDPPDLTVI DKLVVHLYHLCQMFP 

ESASDAIKFVLRDAMHEMEEMIETKGRAALP 

GLDVLIYLKITGLLFPTSDFWHPVVTPALVCL 

QRFIPELINFLLGILYIATPNKASQGSTLVHPFR 

ALGKNSELLVVSAREDVATWQQSSLSLRWA 

SRLRAPTSTEANHIRLSCLAVGLALLKRCVLM 

YOST P^FHAIMHPT P AT T TTVHT Anf^HDAPT f\ 

ELCQSTLTEMESQKQLCRPLTCEKSKPVPLKL 
FTPRLVKVLEFGRKOCiSSKFFOFRKlH UAVVUC 

REFKGAVREIRKDNQFLARMQLSEIMERDAE 
RKRKVKQLFNSLATQEGEWKALKRKKFKK 


690 


2040 


A 


5261 


1 


304 


FFFFVFLVETGFHHVGQAGLELLTSGDPPTW 
ASQSAGITGVSHCSWPVIYVLSTLLHAVRNVL 
FKRTFPLKSSSFLSYDKE3FPILIVLKFYLVTLT 
SFVK 


691 


2041 


A 


5270 


3 


158 


NCHTTHCTANWVHLPGTPPGWKJDGPAAAL 
EVLSSFFFFFLKFSYKPQNIV 


692 


2042 


A 


5282 


56 


1268 


GMEPVGCCGECRGSSVDPRSTFVLSNLAEW 

ERVLTFLPAKALLRVACVCRLWRECVRRVLR 

THRSVTW1SAGLAEAGHLEGHCLVRWAEEL 

ENW1LPHTVLYMADSETFISLEECRGHKRAR 
KRTSMFTA1 AT FKT FPVOPOVl ni\/ r TPr;r\/\rr 

PMGSGSNRPQEIEIGESGFALLFPQIEGIKIQPF 
HFIKDPKNLTI ERHOT TFVHT T TWPFT PVVI V 

FG YNCCKVG A SNYLQQ WSTFSDMNHLAGG 

QVDNLSSLTSEKNPLDIDASGVVGLSFSGHRI 
OSATVLLNEDVSDFKTAFAAMORT KA ATMrPF 

IINTIGFMFACVGRGFQYYRAKGNVEADAFR 
KFFPSVPLFGFFGNGEIGCDRIVTGNFILRKCN 
EVKDDDLFHSYTT1MALIHLGSSK 


693 


2043 


A 


5301 


362 


507 


EEIKERFGPGLVIYWYGFIQELDCNRERGILLK 
ACFPTNIVTLCHSIA 


694 


2044 


A 


5310 


1 


204 


Rva,TAINHTLKENLRKFYKGKKDKPLDLRPK 

KTRAMRRRLNMHEENUCTKXQHRKERLYPL 
RKYAAKA 


695 


2045 


A 


5315 


125 


1596 


ETOSTAVXSEVQVCISLLLCLEDRTMPKKAKP 

TGSGKEEGPAPCKQMKLEAAGGPSALNFDSP 

SSLFESLI SPIKTETFFKEFWEQKPLL IQRDDPA 

LATYYGSLFKLTDLKSLCSRGMYYGRDVNV 

CRCVNGKKKVLNKDGKAHFLQLRKDFDQKR 

ATIQFHQPQRFKDELWRIQEKLECYFGSLVGS 

NVYITPAGSQGLPPHYDDVEVFILQLEGEKH 

WRLYHPTVPLAREYSVEAEERJGRPVHEFML 

KPGDLL YFPRGTIHQ ADTPAGL AH STH VTI ST 

YQNNSWGDFLLDTISGLVFDTAKEDVELRTG 

IPRQLLLQVESTTVATRRLSGFLRTLADRLEG 

T1CELLSSDMKKDFIMHRLPPYSAGDGAEL STP 

G GKLPRLDSV VRLQFKDHTVLT VLPDQDQSD 

ETQEKMVYIYHSLKNSRETHMMGNEEETEFH 

GLRFPLSHLDALKQIWNSPAISVKDLKLTTDE 

EKESLVLSLWTECLIQW 


696 


2046 


A 


5318 


1476 


742 


LMKXYLEAAELGEISDIHTKLLRLSSSQGTIET 

SLQDIDSRLSPGGSLADAWAHQEGTHPKDRN 

VEKLQVLLNCMTEIYYQFKKDKAERRJLAYN 

EEQIHKFDKQKLYYHATKAMTHFTDECVKK 

YEAFLNKSEEWIRKMLHLRKQLLSLTNQCFDI 

EEEVSKYQEYTNELQETLPQKN1FTASSGIKHT 

MTPIYPSSNTLVEMTLGMKKLKEEMEGVVKE 

LAENNHILESGGSLTMDGGLKNVDCL 


697 


2047 


A 


5320 


244 


478 


LDYNFFLFEMTFGLVSQAGVQWHDLGSLQPP 

PPGFKQFSCLSLPSSWDYRHLPPHLANFSREG 

VSPSWPGWSRTPDFR 


698 


2048 


A 


5324 


266 


714 


LPIRKSLRSVRSGFPTSQSPITRNLDGTASGSC 

LAKTVTGSLFRINVGLRGLVAGGIIGALLGTP 

VGGLLMAFQKYSGETVQERKQKDRKALHEL 

KLEEWKGRLQVTEHLPEKIESSLQEDEPENDA 

KKIEALLNLPRNPSVIDKQDKD 


699 


2049 


A 


5334 


699 


277 


RPHGHLVCISSSAGLSGVNGLADYCASKFAA 
FGFAESVFVETFVQKQKGIKTTIVCPFFIKTGM 
FEGCTTGCPSLLPILEPKYAVEKIVEAILQEKM 
YLYMPKLLYFMMFLKSFLPLKTGLLIADYLG I 
LHAMDGFADQKK 


700 


2050 


A 


5344 


3 


614 


PTAEEMSSLTPES SPELAKRS WFGNFISLDKEE 

QIFLVLKDKPLSSIKADIVHAFLSIPSLSHSVLS 

QTSFRAEYKASGGPSVFQKPVRFQVDISSSEG 

PEPSPRRDGSGGGGIYSVTFTLISGPSRRFKRV 

VETIQAQLLSTHDQPSVQALADEKNGAQTRP 

AGAPPRSLQPPPGRPDPELSSSPRRGPPKDKK 

LLATNGTPL 


701 


2051 


A 


5346 


3 


1383 


HASVLFCRVMAASKTQGAVARMQEDRDGSC 

STVGGVGYGDSKDCILEPLSLPESPGGTTTLE 

GSPSVPCIFCEEHFPVAEQDKLLKHMIIEHKIV 

IADVKLVADFQRYILYWRKRFTEQPITDFCSV 

IRINSTAPFEEQENYFLLCDVLPEDR1LREELQ 

KQRLREILEQQQQERNDTNFHGVCMFCNEEF 

LGNRSVILNHMAREHAFNIGLPDNIVNCNEFL 

C 1 Ll^JKJsJLUfNLyL-L. Y CfcK. I rKUKN 1 LKDHMR 

KXQHRKINPKNRE YDRFYVIN YLELGKS WEE 

VQLEDDRELLDHQEDDWSDWEEHPASAVCL 

FCEKQAET1EKL YVHMEDAHEFDLLKIKSEL G 

LNFYQQVKLVNFIRRQVHQCRCYGCHVKFKS 

KADLRTHMEETKHTSLLPDRKTWDQLEYYFP 

TYENDTLLWTLSDSESDLTAQEQNENVPIISE 

DTSKLYALKQSSILNQLLL 


702 


2052 


A 


5356 


2502 


1540 


MAAATRGCRPWGSLLGLLGLVSAAAAAWD 

LASLRCTLGAFCECDFRPDLPGLECDLAQHL 

AGQHLAJCALWKAlJCAFVRJjPAPT^^ 

HGWTGTGKSYVSSLLAHYLFQGGLRSPRVH 

HFSPVLHFPHPSHIERYKKDLKSWVQGNLTA 

CGRSLFLFDEMDKMPPGLMEVLRPFLGSSWV 

VYGTNYRKAIFIFISNTGGEQINQVALEAWRS 

RRDREEILLQELEPV ISRA VLD NF HHO r i> N bUl 

MEERLLDAWPFLPLQRHHVRHCVLNELAQL 

GLEPRDEWQAVLDSTTFFPEDEQLFSSNGCK 

TVASR1AFFL 


703 


2053 


A 


5380 


278 


657 


LFLQKLRMKTEEEARTHTEIEMFLRKEQQKL 
EERLEFWMEKYDKDTEMKQNELNALKATKA 
SDLAHLQDLAKMIREYEQVIIEDRIEKERSKK 
KVKQDLLELKSVIKLQAWWRGTMIRREIGGF 

KM 


704 


2054 


A 


5381 


1 


1003 


FRGRAVKMAAVVEVEVGGGAAGERELDEV 

DM SDLSPEEQ WRVEHARMHAKHRGHEAMH 

AEMVLILIATLWAQLLLVQWKQRHPRSYN 

MVTLFQMWVVPLYFTVKLHWWRFLVIWILF 

SAVTAFVTFRATRKPLVQTTPRLVYKWFLL1Y 

KISYATGIVGYMAVMFTLFGLNLLFKIKPEDA 

MDFGISLLFYGLYYGVLERDFAEMCADYMA 

STIGFY SESGMPTKHL SDSVCA VCGQQ1 FVD V 
SEEGUENTYRLSCNHVFHEFC1RGWCIVGKK 
QTCPYCKEKVDLKRMFSNPWERPHVMYGQL 
LDWLRYLVAWQPVIIGVVQGINY1LGLE 


705 


2055 


A 


5396 


3 


675 



IYDRDPLQLATRAGQPLDINMAGEPKPYRPKP 

GNKRPL SALYRLESKEPFLS VGGY VFDYDYY 

RDDFYNRLFDYHGRVPPPPRAVIPLKRPRVA 

VTTTRRGKGVFSMKGGSRSTASGSTGSKLKS 

DELQT1KKELTQ1KTKID SVLGRLDK IEKQQK 

AEAEAQKKLLEESLVLIQEECVSEIADHSTEEP 

AEGGPDADGEEMTDGIEEAFDEDGGHELFLQ 

IK 


706 


2056 


A 


5410 


2 


98 


GRVGLNLEGRGCSEPKWRHCTPTWATEQDSI 
S 


707 


2057 


A 


5415 


6 


287 


PFKLTPSFLSHAFSSGQERKVFIELNHIKKCNT 
VRG VF VLEEFGN YTILLLGLDS HG SNSNLG AP 
EEGLGAGRKRTSVEKSGGAGVTRKKRDP 


708 


2058 


A 


5423 


3 


291 


SSSNPLGSPSTLWKLCSFVLHNKSCCCSFFGS 

TPTIJlAnXTVRVCGFIPEVSKTTNPLGRTNNS 

GCTIFKTVTLTARSTASLLKSVRPRTHQKE 


709 


2059 


A 


5424 


679 


347 


RIRHEEKRGSRGRGRRTSEEDTPKKKKHKGG 
SEFTDTILS VHPSD VLDMP V DPNEPT Y CLCHQ 
VSYGEMIGCDNPDCPIEWFHFACVDLTTKPK 
GKWFCPRCVQEKRKKK 


710 


2060 


A 


5442 


1073 


559 


QESLKKK1QPKLSLTLSSSVSRGNVSTPPRHSS 
GSLTPPVTPP1TPSSSFRSSTPTGSEYDEEEVDY 
EESDSDESW'riESAISSEAILSSMCMNGGEEK 
PFACPVPGCKKRYKNVNGIKYHAKNGHRTQI 

SAEIIRKMQQ 


711 


2061 


A 


5449 


1 


319 


GDSLCVPQYNKYREERVILFLKMASGHAFQP 
DLVKRIRDAIRMGLSARHVPSLILETKGIPYTL 
NGKKVEVAVKQ11AGKAVEQGGAFSNPETLD 
LYRDIPELQGF 


712 


2062 


A 


5499 


91 


749 


RPTPGHGDFWMQPLTKDAGMSLSSVTLASAL 
QVRGEALSEEEIWSLLFLAAEQLLEDLKNDSS 
DYVVCPWSALLSAAGSLSFQGRVSHIEAAPF 
KAPELLQGQSEDEQPDASQMHVYSLGMTLY 
WSAGFHVPPHQPLQLCEPLHSILLTMCEDQPH 
RRCTLQSVLEACRVHEKEVSVYPAPAGLHIR 
RLVGLVLGTISEVSREPCFSSSSCWSCVAIKI 


713 


2063 


A 


5506 


22 


478 


VEELILVSRLDPHLHTPMYFFLAHLSFLDLSFT 

TSSIPQLLYNLNGCDKTISYMGCAIQLFLFLGL 

GGVECLLLAVMAYDRCVAICKPLHYMVIMN 

PRLCRGLVSVTWGCGVANSLAMSPVTLRLPR 

CGHHEVDHFLCEMPALIRMAC1STV 


714 


2064 


A 


5514 


25 


220 


AIRPYWCENNIIGIGICLSTADGKAFADPEVLR 
RLTSSVSCALDEAAAALTRMRAESTANAGQS 
DK 


715 


2065 


A 


5526 


3 


810 


KVTAPRRPQRYSSGH GSDNSSVLSGELPP AM 

GRTALFHHSGGSSGYESLRRDSEATGSASSAP 

DSMSESGAASPGARTRSLKSPKKRATGLQRR 

RLIPAPLPDTTALGRKPSLPGQWVDLPPPLAG 

SLKEPFEIKVYEIDDVERLQRPRPTPREAPTQG 

L AC V STRLRL AERRQ QRLRE V Q AKHKHL CEE 

LAETQGRLMLEPGRWLEQFEVDPELEPESAE 

YLAALERATAALEQCVNLCKAHVMMVTCFD 

ISVAASAAIPGPQEVDV 


716 


2066 


A 


5529 


458 


790 


SPGYGENKFTVTSXNIAVPLCEMNK1YSYYSD 
SSSSERTMDLVLEMCNTNSIHWCGISGRQLG 
KLHPSSSLCLALTLLSSVQGLQSISGLRLTDTF 
LKRTYEYDDIAQVCV 


717 


2067 


A 


5531 


3 


460 


NSEDLLKYFNPESWQEDLDNMYLDTPRYRG 

RSYHDRKSKVDLDRLNDDAKRYSCTPRNYS 

VNIREELKLANWFFPRCLLVQRCGGNCGCG 

TVNWRSCTCNSGKTVKKYHEVLQFEPGHIKR 

RGRAKTMALVDIQLDHHERCDCICSSRPPR 


718 


2068 


A 


5586 


311 


88 


AVLKNMAPMTALGlXDLfflLNLlLFLSAGEDF 

TSWSEIMMYILLVFLTLWLLIEMIYCYRKVS 

KAEEAAQENA 


719 


2069 


A 


5598 


1 


330 


KNCANEAWQKILDRVLSRYDVRLRPNFGSM 
LATNSTRGLNEDELMAHGQEKDSSSESEDSC 
PPSPGCSFTEGFSFDLLNPDYVPKVDKWSRFL 
FPLAFGLFN1VAAERC 


720 


2070 


A 


5628 


798 


148 


LPPAQIPEAWLLLANWWLILVPLKDRLIDP 

LLLRCKLLPSALQKMALGMFFGFTSVIVAGV 

LEMERLHYIHHNETVSQQIGEVLYNAAPLSIW 

WQIPQYLLIGISEIFASIPGLEFAYSEAPRSMQG 

AIMG1FFCLSGVGSLLGSSLVALLSLPGGWLH 

CPKDFGNINNCRMDLYFFLLAGIQAVTALLF 

VWIAGRYERASQGPASHSRFSRDRG 


721 


2071 


A 


5632 


146 


536 


MSALIVRKLRSAELTLFSELPTVLGANVNAA 
KLHETALHHAAKVKNVDLIEML1EFGGNIYA 
RDN R G KKP S D Y T W S S S APAKCFE Y YEKTPLT 
LSQLCRVNLRKATGVRGLEK1AKLNIPPRLID 
YLSYN 


722 


2072 


A 


5638 


3 


3806 


CPSLDIRSEVAELRQLENCSWEGHLQILLMF 

TATGEDFRGLSFPRLTQVTDYLLLFRVYGLES 

LRDLFPNLAVIRGTRLFLGYALVIFEMPHLRD 

VALPALGAVLRGAVRVEKNQELCHLSTIDW 

GLLQPAPGANHIVGNKLGEECADVCPGVLGA 

AGEPCAKTTFSGHTDYRCWTSSHCQRVCPCP 

HGMACTARGECCHTECLGG CSQPEDPRACV 

ACRHLYFQGACLWACPPGTY QYES WRCVTA 

ERCASLHSVPGRASTFGIHQGSCLAQCPSGFT 

RNSSS1FCHKCEGLCPKECKVGTKTIDS1QAA 

QDLVGCTHVEGSL1LNLRQGYNLEPQLQHSL 

GLVETITGFLKIKHSFALVSLGFFKNLKLIRGD 

AMVDGNYTLYVLDNQNLQQLGSWVAAGLTI 

PVGKIYFAFNPRLCLEHIYRLEEVTGTRGRQN 

KAEINPRTNGDRAACQTRTLRFVSNVTEADRT 

LLRWERYEPLEARDLLSFIVYYKESPFQNATE 

HVGPDACGTQSWNLLDVELPLSRTQEPGVTL 

ASLKPWTQYAVFVRAITLTTEEDSPHQG AQS 

P1VYLRTLPAAJTVPQDV1STSNSSSHLLVRW 

KPPTQRNGNLTYYLVLWQRLAEDGDLYLND 

YCHRGLRLPTSNNDPRFDGEDGDPEAEMESD 

CCPCQHPPPGQVLPPLEAQEASFQKKFENFLH 

NAITIPISPWKVTSINKSPQRDSGRHRRAAGPL 

RLGGNSSDFEIQEDKVPRERAVLSGLRHFTEY 

RIDIHACNHAAHTVGCSAATFVFARTMPHRE 

ADG1PGKVAWEASSKNSVLLRWLEPPDPNGL 

ILKYEIKYRRLGEEATVLCV SRLRYAKFGG V 

HLALLPPGNYSARVRATSLAGNGSWTDSVAF 

Y1LGPEEEDAGGLHVLLTATPVGLTLLIVLAA 

LGFFYGKKRNRTLYASVNPEYFSASDMYVPD 

EWEVPREQISI1RELGQGSFGMVYEGLARGLE 

AGEESTPVALKTVNELASPREC1EFLKEASVM 

KAFKCHHVVTUXGWSQGQPTLVIMELMTR 

GDLKSHLRSLRPEAENNPGLPQPALGEMJQM 

AGEIADGMA YLAANKFVHRDL AARNCM V SQ 

DFTVKlGDFGMTRDVYbl DYYKKUCjKuLLr 

VRWMAPESLKDGIFTTHSDVWSFGVVLWE1V 

TLAEQPYQGLSNEQVLKFVMDGGVLEELEGC 

PLQLQELMSRCWQPNPRLRPSFTHILDSIQEEL 

RPSFRLLSFYYSPECRGARGSLPTTDAEPDSSP 

TPRDCSPQNGGPGH 


723 


2073 


A 


5672 


1 


216 


LA WI DNELPEKEKKETDKKRKRKKG AHEDCD 
EEPQFPPPSVIK1PMESVQSDPQNGIHCIAJRJCR 

SSSWSYSL 


724 


2074 


A 


5704 


4235 


940 


ARGRRSRPVWAASWGGRGRPAARRRPRGLA 

ATMGFELDRFDGDVDPDLKCALCHKVLEDP 

LTTPCG HVFCAG CVLP W WQEG SCP ARCRGR 

LS AKELNH VLPLKRL ILKLDIKC A Y ATRG CGR 

VVKLQQLPEHLERCDFAPARCRHAGCGQVLL 

RRDVEAHMRDACDARPVGRCQEGCGLPLTH 

GEQRAGGHCCARALRAJ-INGALQARLGALHK 

ALKKEALRAGKREKSLVAQLAAAQLELQMT 

ALRYQKKFTEYSARLDSLSRCVAAPPGGKGE 

ETKSLTLVLHRDSGSLGFNIIGGRPSVDNIIDG 

SSSEGIFVSKIVDSGPAAKEGGLQIHDRIIEVN 

GRDLSRATHDQAVEAFKTAKEPIVVQVLRRT 

PRTKJvlFTPPSESQLVDTGTQTDITFEHIMALT 

KMSSPSPPVLDPYLLPEEHPSAHEYYDPNDY1 

GDIHQEMDREELELEEVDLYRMNSQDKLGLT 

VCYRTDDEDDIGIYISEIDPNSIAAKDGRTREG 

DRI1QINGIEVQNREEAVALLTSEENKNFSLLI 

a pa pi or nFf;wMDDDRNr>Fi DDI HMDMLE 

EQHHQAMQFTASVLQQKKHDEDGGTTDTAT 

ILSNQHEKDSGVGRTDESTRNDESSEQENNG 

DDATASSNPLAGQRKLTCSQDTLGSGDLPFS 

NESFISADCTDADYLGIPVDECERFRELLELK 

CQVKSATPYGLYYPSGPLDAGKSDPESVDKE 

LELLNEELRSIELECLSIVRAHKMQQLKEQYR 

ESWMLHNSGFRNYNTSIDVRRHELSDITELPE 

KSDKDSSSAYNTGESCRSTPLTLEISPDNSLRR 

725 



2075 



5707 



726 



2076 



727 



2077 



728 



2078 






1770 






AAEGI S CPSSEGAVGTTEAYGPASKNLLSITE 

DPEVGTPTYSPSLKELDPNQPLESKERRASDG 

SRSPTPSQKLGSAYLPSYHHSPYKHAHIPAHA 

QHYQSYMQLIQQKSAVEYAQSQMSLVSMCK 

DLSSPTPSEPRMEWKVKIRSDGTRYITKRPVR 

DRLLRERALKIREERSGMTTDDDAVSEMKM 

GRYWSKEERKQHLVKAKEQRRJRREFMMQSR 

LDCLKJEQQAADDRXEMNILELSHKKMMKKR 

NKKIFDNWMTIQELLTHGTX SPDGTRV YNSF 

LSVTTV 



5711 



5716 



5737 



729 



2079 



156 



1899 



5741 



423 



274 



649 



5976 



QI STEVSEAPV ANDKPKTLWKVQKKAADLP 

DRDTWKGRFDFLMSCVGYAJGLGNVWRFPY 

LCGKNGGGAFLIPYFLTLIFAGVPLFLLECSLG 

QYTSIGGLGVWKLAPMFKGVGLAAAVLSFW 

LNIYYIVnSWAIYYLYNSFTTTLPWKQCDNP 

WNTDRCFSNYSMVNTTNMTSAVVEFWERN 

MHQMTDGLDKPGQIRWPLAITLAIAWILVYF 

CIWKGVGWTGKWYFSATYPYIMLIILFFRGV 

TLPGAKEGILFYITPNFRKLSDSEVWLDAATQ 

IFFSYGLGLGSLIALGSYNSFHNNVYRDSIIVC 

CINSCTSMFAGFVIFSIVGFMAHVTKRSIADV 

AASGPGLAFLAYPEAVTQLPISPLWAILFFSM 

LtMLGIDSQFCTVEGFITALVDEYPRLLRNRJR 

ELFIAAVCIISYUGLSNITQGGIYVFKLFDYYS 

ASGMSLLFLVFFECVSISWFYGVNRFYDNIQE 

MVGSRPCIWWKLCWSFFTPIIVAGVFIFSAVQ 

MTPLTMGNYVFPKWGQGVGWLMALSSMVL 

IPGYMAYMFLTLKGSLKQRIQVMVQPSEDIV 

RPENGPEQPQAGSSTSKEAYI 

PRRDPGRTPELRGSAPRKTGANMPVRRGHVA 
PQNTFLGTIIRKFEGQNKKFIIANARVQNCAII 

YCNDGFCEMTGFSRPDVMQKPCTCD 

HASEYFFKLCSFQVFLSFPLATIVIDVGLWIP 
LVKSPNVHYVYVLLLVLSGLLFYIPLIHFKIRL 

AWFEKMTCYLQLLFNICLPDVSEE 

IQASRASPYPRVKVDFALSCHEDLLAPISEPIE 

WKYHSPEEEISLGPACWLWDFLRRSQQAGFL 

LPLSGGVDSAATACLIYSMCCQVCEAVRSGN 

EEVLADVRTIVNQISYTPQDPRDLCGRILTTC 

YMASKNSSQETCTRARELAQQIGSHHISLNID 

PAVKAVMGIFSLVTGKSPLFAAHGGSSRENL 

ALQNVQARIRMVLAYLFAQLSLWSRGVHGG 

LLVLGSANVDESLLGYLTKYDCSSADINPIGG 

ISKTDLRAFVQFaQRFQLPALQSlLLAPATAE 

LEPLADGQVSQTDEEDMGMTYAELSVYGKL 

RKVAKMGPYSMFCKLLGMWRHICTPRQVAD 

KVKI^FSKYSMNRHKMTTLTPA YHAEN Y SPE 

DNRFDLRPFL YNTS WPWQFRCIEN Q VLQLER 

AEPQSLDGVD 



PGCAARL SRARAPGPG AAG AGRKRL ADPGPP 

PASRRLRAPGSRPRLAPCTRRAAQPAHARMA 

PRAAGGAPLSARAAAASPPPFQTPPRCPVPLL 

LLLLLG AARAG ALEI QRRFPSPTPTNNF ALDG 

AAGTVYLAAVNRLYQLSGANL SLEAEAAVG 

PVPDSPLCHAPQLPQ A SCEHPRRLTDN YNKIL 

QLDPGQGLWVCGSIYQGFCQLRRRGNISAV 

AVRFPPAAPPAEPVTVFPSMLNVAANHPNAS 

TVGLVLPPAAGAGGSRLLVGATYTGYGSSFF 

PRNRSLEDHRFENTPEIAIRSLDTRGDLAKLFT 

FDLNPSDDNILKIKQGAKEQHKLGFVSAFLHP 

SDPPPGAQSYAYLAXNSEARAGDKESQARSL 

LARICLPHGAGGDAKKLTESYIQLGLQCAGG 

AGRGDLYSRLVSVFPARERLFAVFERPQGSPA 

ARAAPAALCAFRFADVRAAIRAARTACFVEP 

APDVVAVLDSWQGTGPACERKLN1QLQPEQ 

LDCGAAHLQHPLSILQPLKATPVFRAPGLTSV 

AVASVNNYTAVFLGTVNGRLLKINLNESMQ 

VVSRRWTVAYGEPVHHVMQFDPADSGYLY 

LMTSHQMARVKVAACNVHSTCGDCVGAAD 

AYCGWCALETRCTLQQDCTNSSQQHFWTSA 

SEGPSRCPAMTVLPSEIDVRQEYPGMILQISGS 

LPSLSGMEMACDYGNNIRTVARVPGPAJFGHQ 

I A YCNLLPRD QFPPFPPN QDHVTV EMSVRVN 

GRNI VKANFTI YDCS RTAQVYPHTACTSCLS A 

QWPCFWCSQQHSCVSNQSRCEASPNPTSPQD 

CPRTLLSPLAPVPTGGSQNILVPLANTAFFQG 

AALECSFGLEEIFEAVWVNESVVRCDQWLH 

TTRKSQVFPLSLQLKGRPARFLDSPEPMTVM 

VYNCAMGSPDCSQCLGREDLGHLCMWSDGC 

RLRGPLQPMAGTCPAPEIRAIEPLSGPLDGGT 

LLT1RGRNLGRRLSDVAHGVWIGGVACEPLP 

DRYTVSEEIVCVTGPAPGPLSGWTVNASKE 

GKSRDRFSYVLPLVHSLEPTMGPKAGGTRTTI 

HGNDLHVGSELQVLVNDTDPCTELMRTDTSI 

ACTMPEGALPAPVPVCVRFERRGCVHGNLTF 

WYMQNPVITAISPRRSPVSGGRTITVAGERFH 

MVQNVSMAVHHIGREPTLCKVLNSTLITCPSP 

GALSNASAPVDFFINGRAYADEVAVAEELLD 

PEEAQRGSRFRLDYLPNPQFSTAKREKWIKH 

HPGEPLTLVIHVSTKGAGKEQDSLGLQSHEY 

RVKJGQVSCDIQIVSDRIIHCSVNESLGAAVGQ 

LPITIQVGNFNQTIATLQLGGSETAUVSIVICSV 

LLLLSVVALFVFCTKSRRAERYWQKTLLQME 

EMESQ1 REEIRKGFAELQTDMTDLTKELNRSQ 

GIPFLEYKHFVTRTFFPKCSSLYEERYVLPSQT 

LNSQG SSQAQETHPLLG EWKIPESCRPNMEE 

GI SLFSSLLDNKHFLI VFVHALEQQKDFAVRD 

RCSLASLLTIALHGKLEYYTSIMKELLVDLID 

ASAAKNPKLMLRRTESVVEKMLTNWMSICM 

Y SCLRETV GEPFFLLLCAIKQQINKGSID AITG 

KARYTLNEEWLLRENIEAKPRNLNVSFQGCG 

MDSLSVRAMDTDTLTQVKEKILEAFCKNVPY 

SQWPRAEDVDLEWFASSTQSY1LRDLDDTSV 

VEDGRKKLN TLAHYKIPEGASLAMSLIDKKD 

"kTTT /™ , T) \ Tt/ TW r^TTTTX IT \7T F1TTYCT A 1"? Til/ V P IJ 

NTLORVKDLD 1 EK YrHLVLP \ DhLAhrKKbH 
RQSHRKKVLPEIYLTRLLSTKGTLQKFLDDLF 
KAILSIREDKPPLAVKYFFDFLEEQ AEKRGI SD 
PDTLHrWKTNSLPLRFWVNILXNPQFVFDIDK 

Tr\i.nn a r-n ci/i a r\ a a pcicni AT f n vrvQT>T*KT 
1 uH\D ACLo V lAQAr IL/ACol oJJLQLUKUor J JN 

KLLYAKEIPEYRKIVQRYYKQIQDMTPLSEQE 
MNAVTT AFFSRlCYONFFTvTTMVAMAFIYKYAK 

RYRPQIMAALEANPTARRTQLQHKFEQVVAL 
MEDNIYECYSEA 


730 


2080 


A 


5744 


3 


292 


QPSPLFHSHLETLQLLRTAQLPEQVSWPWGQ 
VANGKGNQRNMGSPQPSLLAFERNLELQIMG 
LGYSLLMGKLRPRVAKDTLRVHRDSTPSPLT 
LKD 


731 


2081 


A 


5747 


1 


382 


FLKCMRKAFRSSKLLQVGYTPDGKDDYRWC 
FRVDEVNWTTWNTNVGIINEDPGNCEGVKRT 
LSFSLRSSRVSGRHWKNFALVPLLREA SARD 

RQSAQPEEVYLRQFSGSLKPEDAEVFKSPAAS 
GEK 


732 


2082 


A 


5753 


198 


3 


AQAES STVASPEATAGPLCTRIPN VPPPTPIRP 
PGKLQAQLPCPSPVRFTSAR1PPASRPQTKS 


733 


2083 


A 


5754 


2 


2223 


AAGPPGLEAEGRAPESAGPGPGGDAAETPGL 

PPAHSGTLMMAFRDVTVQIANQNISVSSSTAL 

SVANCLGAQTVQAPAEPAAGKAEQGETSGR 

EAPEAPAVGREDASAEDSCAEAGASGAADG 

ATAPKTEEEEEEEETAEVGRGAEAEAGDLEQ 

LNRTSTSTKSAKSGSEASASASKDALQAMILS 

LPRYHCENPASCKSPTLSTDTLRKRLYRJGLN 

LFNINPDKGIQFLISRGFIPDTPIGVAHFLLQRK 

GLSRQMIGEFLGNSKKQFNRDVLDCWDEM 

DFSSMELDEALRKFQAHIRVQGEAQKVERLIE 

AFSQRYCMCNPEWQQFHNPDTIFILAFAIILL 

NTDMYSPNIKPDRKMMLEDFIRNLRGVDDG 

ADIPRELVVGIYERIQQKELKSNEDHVTYVTK 

VEKSIVGMKTVLSVPHRRLVCCSRJLFEVTDV 

NKLQKQAAHQREVFLFNDLLVILKLCPKKKS 

SSTYTFCKSVGLLGMQFQLFENEYYSHGITLV 

TPLSGSEKKQVLHFCALGSDEMQKFVEDLKE 

S1AEVTELEQIRIEWELEKQQGTKTLSFKPCGA 

QGDPQSKQGSPTAKREAALRERPAESTVE V SI 

HNRLQTSQHNSGLGAERGAPVPPPDLQPSPPR 

QQTPPLPPPPPTPPGTLVQCQQIVKVIVLDKPC 

LARMEPLLSQALSCYTSSSSDSCGSTPLGGPG 

SPVKVTHQPPLPPPPPPYNHPHQFCPPGSLLH 

GHRYSSGSRSLV 


734 


2084 


A 


5788 


8 


362 


SSVMGDLVGQGLEEQIVARDENSWLIDGGTP 
IDDVMRVLDIDEFPQSGNYETIGGFMMFMLR 
KJPKRTDSVKFAGYKFEVVDIDNYRIDQLLVT 
RIDSKATALSPKLPDAKDKEESVA 


735 


2085 


A 


5827 


1 


1257 


MVFSAVLTAFHTGTSNTTFVVYENTYMNITL 

PPPFQHPDLSPLLRYSFETMAPTGLSSLTVNST 

AVPTTPAAFKSLNLPLQITLSAIMIFILFVSFLG 

NLVVCLMVYQKAAMRSAINILLASLAFADM 

LLAVLNMPFALVTILTTRWIFGKFFCRVSAMF 

FWLFVIEGVAILLIISIDRFLIIVQRQDKLNPYR 

A&vLlAvbVfA I arLVA>PLAVCjNPDLQIPSRA 

PQCVFGYTTNPGYQAYVILISLISFFIPFLVILY 

SFMGILNTLRHNALRIHSYPEGICLSQASKLGL 

MGLQRPFQMSIDMGFKTRAFTTILILFAVFIVC 

WAPFTTYSLVATFSKHFYYQHNFFEISTWLL 

WLCYLK SALNPLIYY WRIKKFHDACLDMMP 

fvorivr L-rlJl-frvjli 1 RJKKiKrbA V Y VCObHK I VV 


736 


2086 


A 


5870 


3 


268 


FTRSDELARHYRTHTGEKRFSCPLCPKQFSRS 
DHLTKHARRHPTYHPDMIEYRGRRRTPRIDPP 
LTSEVES SASGSGPGPAPSFTTCL 


737 


2087 


A 


5871 


2 


521 


T TWPOT FT FTI PPI 1 TJTV/TQTJP ATJrV^DQIV'l AT \ro 

RSSSLGYISKAEEYFLLKSRSDLMFEKQSERH 
GL ARRLTTARRPPAS SEQAQQELFNELKPA V 
DGANFtVNHMRDQNNYT^EEKDSWNRVART 
VDRLCLFWTPVMWGTAWIFLQGVYNQPPP 
QPFPGDPYSYNVQDKRFI 


738 


2088 


A 


5881 


1 


1160 


LVVTAITAILAFPNEYTRMSTSELISELFNDCG 
LLDS SKLCD YENRFNTSKGGELPDRPAG VG V 
YSAMWQLALTLILKIVIUFTFGMKIPSGLFIPS 
MAVGAIAGRLLGVGMEQLAYYHQEWTVFNS 

WCSOfiADOTTPfil YAMVGAAACLGGVTRMT 

VSLWIMFELTGGLEYIVPLMAAAMTSKWVA 

DALGREGIYDAH1RLNGYPFLEAKEEFAHKTL 

AMDVMKPRRNDPLLTVLTQDSMTVEDVETII 

SETTYSGFPVWSRESQRLVGFVLRRDLIISIE 

NARKKQDGVVSTSIIYFTEHSPPLPPYTPPTLK 

LRNILDLSPFTVTDLTPMEIVVDIFRKLGLRQC 

LVTHNGRLLGI1TKKDVLKHIAQMANQDPDSI 

LFN 


739 


2089 


A 


5892 


2 


916 


TLQLAASVPFFAISLISWWLPESARWLIINGKP 

KEEVASAKEPRSVLDLFCVPVLRWRSCAMLV 
VNFSLLISYYGLVFDLQSLGRDIFLLQALFGA 
VDFLGRATTAL LLSFLGRRTIQAGSQAMAG L 
AILANMLVPQDLQTLRVVFAVLGKGCFGISL 
rn ttvitaft FPm>vRK4TAnnn T-rrvriRT ga 

MMGPLILMSRQALPLLPPLLYGVISIASSLVVL 

FFLPETQGLPLPDTIQDLESQKSTAAQGNRQE 

AFTVESTSLLEIVALHGAL 


740 


2090 


A 


5900 


2 


426 


RPIKTLGIGFHFSVDGVHFLTQREVQNLWKE 

KCGCHFHEVVKSKLSKEYNFIKMKRSRNHIM 
GRYFSNQSKLQQGTVTNFRSPYHVRGPINQV 
CSEILLSRMCANKRTM 


741 


2091 


A 


5910 


3 


412 


RMPESTLLIICENGYILEAPLFTIKQEEDDHDV 
VSYEIKDMCIKCFHFSSVKSKILRLIEIEKRER 

HPPT lifFlif TPFFPP>Jfn A AFMfiFnnFKFFnFF 

EEEKEEEEEEEEPLPEIFIPSTPSPILCGFYSEPG 
KFWV 


742 


2092 


A 


5936 


1 


482 


MGCRLLCCVVFCLLQAGPLDTAVSQTPKYLV 

TQMGNDKSIKCEQNLGHDTMYWYKQDSKK 

FLKIMFSYNNKJELUNF I'VPNRFSPKSPDKAHL 

NLHINSLELGDSAVYFCASSQDTALQSHCIPV 

HKPPGSARKLQGSVCTCTQGSSLHSLMASDG 

VPVC 


743 


2093 


A 


5938 


1 


1566 


MNSFFGTPAASWCLLESDVSSAPDKEAGRER 

RALSVQQRGGPAWSGSLEWSRQSAGDRRRL 

GLSRQTAKSSWSRSRDRTCCCRRAWWILVPA 

ADRARRERFINfNEKWDTNSSENWHPIWNVN 

DTKJHOHX.YSDINITYVNYYLHQPQVAAIFIISYF 

LIFFLCMMGNTVVCFrVMRNKHMHTVTNLFI 

LNLA1SDLLVGIFCMPITLLDNIIAGWPFGNTM 

CKISGLVQGISVAASVFTLVAIAVDRFQCWY 

PFTfPVT TTk'TAFVTTMTTWVT A ITTM <s P<s A VMT H 

VQEEKYYRVRLNSQNKTSPVYWCREDWPNQ 

FMTJtfl VTTVT FAXIT VI APT ST TVTMYfTRTfil'sT F 

RAAVPHTGRKNQEQWHWSRKKQKIIKMLLI 

VALLFILSWLPLWTLMMLSDYADLSPNELQII 

NIYIYPFAHWLAFGNSSVNPIIYGFFNENFRRG 

FQEAFQLQLCQKJIAKPMEAYALKAKSIIVLIN 

TSNQL VQESTFQNPHGETLL YRKSAEKPQQE 

LVMEELKETTNSSEI 


744 


2094 


A 


5966 


149 


327 


SHVCVSHYAGSSGCPAGAGAGAVAXGISAVA 
LYDYQGGRLGVARGAWYMEAPDIRQGDM 


745 


2095 


A 


5970 


413 


856 


GAPHTDWAWAPTPMSGLGSGRGRQGTLASS 

PLSLPLLLAGVTGILATELFDQMARPAACMV 

CGALMWIMLILVGLGFPHMEALSHFLYVPFL 

GVCVCGAIYTGLFLPETKGKTFQE1SKJELHRL 

NFPRRAQGPTWRSLEVIQSTEL 


746 


2096 


A 


5971 


3 


1343 


AQTARRIIGLELDTEGHRLFVAFSGCIVYLPLS 

RCARHGACQRSCLASQDPYCGWHSSRGCVDI 

RGSGGTD VDQAGNQE SMEHGDCQDGATGSQ 

SGPGDSAYGVRRDLPPASASRSVPIPLLLASV 

AAAFALGASVSGLLVSCACRRAHRRRGKDIE 

TPGLPRPLSLRSLARLHGGGPEPPPPSKDGDA 
VOTPOI YTTFT PPPFflVPPPFI APT PTPPCTPP 

LPVKHLRAAGDPWEWNQNRNNAKEGPGRSR 

GGHAAGGPAPRVLVRPPPPGCPGQAVEVTTL 

EELLRYLHGPQPPRKGAEPPAPLTSRALPPEP 

APALLGGPS PRPHECASPLRLDVPPEGRCAS A 

PARPALSAPAPRLGVGGGRRLPFSGHRAPPAL 

LTRVPSGGPSRYSGGPGKHLLYLGRPEGYRG 

RALKRVDVEKPQLSLKPPLVGPSSRQAVPNG 

GRFNF 


747 


2097 


A 

fx. 




o 


ISA 


"P\TJ ACT nPOH rviT TT>T?TVt /TyTTun rm/iritm/i «-vt rm> 

l/HAixLrUb WfsrlKrUVlii KJiVMODHSGQVTI 

LKLEQENCTLVTTFRGHTGGVTALCWDPVQ 

RVLFSGSSDHSVIMWDIGGRKGTAIELQGHN 

DRVQALSYAQHTRQLISCGGDGGIVVWNMD 

VERQETPEWLDSDSCQKCDQPFFWNFKQMW 

DSKKIGLRQHHCRKCGKAVCGKCSSKRSSIPL 

MGFEFEVRVCDSCHEAITDEERAPTATFHDSK 

HNIVHVHFDATRGWLLTSGTDKVIKLWDMT 

r V V o 


748 


2098 


A 


6001 


2 


747 


AMVFGGVVPYVPQYRDIRRTQNADGFSTYV 
CLVLLVANILRILFWFGRRFESPLLWQSAIM1L 
TMLLMLKLCTEVRVANELNARRRSFTAADS 
KDEEVKVAPRRSFLDFDPHHFWQWSSFSDYV 

OPVI AT7TnVA nVTTVT CTTVIAT Cl/cri /-> ttt aw 

LTEAMLGVPQLYRNHRHQSTEGMSIKMVLM 

WTSGDAFKTAYFLLKGAPLQFSVCGIXQVLV 

DLAILGQAYAFARHPQKPAPHAVHPTGTKAL 


749 


2099 


A 


6002 


2 


447 


GRPDR SELVRMHILEETF AEPSLQATQMKLK 

RAPT AnT>I XFPT^tAOPPrSPNjfPT VUVXTIT D\moc 

VKEAnGVGKEDYPHTQGDFSFDEDSSDALSP 
DQPASQESQGSAASPSEPKVSESPSPVTTNTP 
AQFAS VS PTVPEFLKTPPTAD 


750 


2100 


A 


6004 


2 


427 


LLT Q AM L VL PHRPQ W FTPG PRLQ A QG PCQ EG 

WRWELRLRNYVPEDEDLNKRRVPQAKPDAV 

QEKVKEQLEAAKPEPVIEEVDLAKLAPRKPD 

WDLKRDVAKKXEKLLKRTQRAIAELIRERLK 

GQEDSLDSAVDAATEHKTC 


751 


2101 


A 


6007 


33 


1280 


TDQAKVDNQPEKLVRSAEDVSTVPTQPDNPF 

SHPDKLKRMSKSVPAFLQDESDDRETDTASE 

SSYQLSRHKKSPSSLTNLSSSSGMTSLSSVSGS 

VMSVYSGDFGNLEVKGNIQFAEEYVESLKEL 

HVFVAQCKDLAAADVKKQRSDPYVKAYLLP 

l^ixUKiVioJSJVlvi L V V Jvxvi LiMrV iNrJLKYK,lEK 

QILKTQKLNLSIWHRDTFKRNSFLGEVELDLE 

TWDWDNKQNKQLRWYPLKRKTAPVALEAE 
NRGEMKJLALOYVPEPVPGKKLPTTGF.VHIWV 

KECLDLPLLRGSHLNSFVKCTILPDTSRKSRQ 

KTRAVGKTTNPIFNHTMVYDGFRPEDLMEAC 

VELTVWDHYKLTISrQFLGGLRIGFGTGKSYGT 

EVDWMDSTSEEVALWEKMVNSFNTWIEATL 

PLRMLLIAKISK 


752 


2102 


A 


6028 


108 


1283 


KEIFSPFELISVKPLCLLLGVTCSQSMAFEELL 

SQVGGLGRFQMLHLVFILPSLMLLIPHILLENF 

AAAIPGHRCWVHN4LDNNTGSGNETGILSEDA 

LLRISIPLDSNLRPEKCRRFVHPQWQLLHLNG 

TIHSTSEADTEPCVDGWVYDQSYFPST1VTKW 

DLVCDYQSLKSWQFLLLTGMLVGGIIGGHV 

SDRFGRRFILKWGLLQLAITDTCAAFAPTFPV 

YCVLRFLAGFSSMIIISNNSLPITEWIRPNSKAL 

WILSSGALNIGQIILGGLAYVFRDWQTLHVV 

ASVPFFVFFLLSRWLVESARWLIITNKLDEGL 

KALRKVARTNGIKNAEETLNIEVVRSTMQEE 

LDAAQTKTTVWDLFRNPSMRKRICILVFLRK 

KNLKJEKA 


753 


2103 


A 


6043 


1 


1470 


DSFESILRLIFEIHHSGEKGDIVVFLACEQDIEK 

VCETVYQGSNLNPDLGELWVPLYPKEKCSL 

FKPLDETEKRCQVYQRRWLTTSSGEFLIWSN 

SVRFVIDVGVERRKVYNPRIRANSLVMQPISQ 

SQAEIRKQILGSSSSGKFFCLYTEEFASKDMTP 

LKPAEMQEANLTSMVLFMKRIDIAGLGHCDF 

MNRPAPESLMQALEDLDYLAALDNDGNLSE 

FGIIMSEFPLDPQLSKSILASCEFDCVDEVLTIA 

AM VTAPN CF S H VPHG AEEAALTC W KTF L H PE 

GDHFTLISIYKAYQDTTLNSSSEYCVEKWCRD 

YFLNCSALRMADVIRAELLEIIKRJELPYAEPA 

FGSKENTLNIKKALLSGYFMQIARDVDG SGN 

YLMLTHKQVAQLHPLSGYSITKKMPEWVLF 

HKFSISENNYIRITSEISPELFMQLVPQYYFSNL 

PPSESKDILQQ WDHLSPVSTMNKEQQM CET 

CPETEQRCTLQ 


754 


2104 


A 


6055 


2 


394 


YYALHHWPFPDLLCQTTGAIFQMNMYGSCIF 
LMLINVDRYAAIVHPLRLRHLRRPRVARLLC 
LGVWALILVFAVPAARVHRPSRCRYRDLEVR 
LCFESFSDEL WKGRIXPL VLL AEALG FLLPL A 
AVVYSS 


755 


2105 


A 


6059 


3 


1795 


LGLGSGTLLSVSEYKKKYREHVLQLHARVKE 

RNARSVKJTKRFTKLL1APESAAPEEALGPAEE 

PEPGRARRSDTHTFNRLFRRDEEGRRPLTVVL 

QGPAGIGKTMAAKKILYDWAAGKLYQGQVD 

FAFFMPCGELLERPGTRSLADLILDQCPDRGA 

PVPQMLAQPQRLLFILDGADELPALGGPEAAP 

CTDPFEAASGARVLG GLLSKALLPTALLL VTT 

RAAAPGRLQGRLCSPQCAEVRGFSDKDKKK 

YFYKFFRDERRAERAYRFVKENETLFALCFV 

PFVCWIVCTVLRQQLELGRDLSRTSKTTTSVY 

LLFITSVLSSAPVADGPRLQGDLRNLCRLARE 

GVLGRRAQFAEKELEQLELRGSKVQTLFLSK 

KELPGVLETEVTYQFIDQSFQEFLAALSYLLE 

DGGVPRTAAGGVGTLLRGDAQPHSHLVLTT 

RFLFGLLSAERMRDIERHFGCMVSERVKQEA 

LRW VQGQG QGCPG VAPEVTEGAKGLEDTEE 

PEEEEEGEEPNYPLELLYCLYETQEDAFVRQA 

LCRFPELALQRVRFCRMDVAVLSYCVRCCPA 

GQALRLISCRLVAAQEKKKKSLGKRLQASLG 
GG 


756 


2106 


A 


6060 


12 


436 


SGRPTRPAKPTGQGMGRFMLTLVCQGSIMMS 

ARDLIMNNLTELQPGLFHHLRFLEELRLSGNH 

LSH1PGQAFSGLYSLKJLMLHNNQLGGIPAQA 

LWELPSLQSLRLDANLISLVPERSFEGLSSLRH 

LWLDDNALTEIPS 


757 


2107 


A 


6063 


54 


419 


ITPLGLGAADMCAFPWLLLLLLLQEGSQRRL 
WRWCGSEEWAVLQESISLPLEIPPDEEVENII 
WSSHKSLATVVPGKEGHPATIMVTNPHYQG 
QILTMLLRSLQQPSASWPRDCSSSCSW 


758 


2108 


A 


6066 


125 


438 


IGI SCPATIFVPMFSH SLIGIGEEYQLPYYNMV 
PSDPSYEDMREWCVKRLRPIVSNRWNSDEC 

MVESQDVKI 


759 


2109 


A 


6072 


3 


650 


PGRRFRPAALEERAMEKLREKVPFQNRGKGT 

LSSIIPNNSDTRKATETTSLSSKPEYVNPDFRW 

SKDPSSKSGNLLETSEVGWTSNPEELDP1RLA 

LLGKSGLSCQVGSATSHPVSCQEPIDEDQRISP 

KDKSTAGREFSGQVSHQTTSENQCTPIPSSTV 

HSSVADMQNMPAAVHALLTQPSLSAAPFAQ 

KYL-OlLro J Ot>J JlJ^CHAuNATVW 


760 


2110 


A 


6077 


3 


730 


PLRLTLMEEVLLLGLKDREGYTSFWNDCISSG 

LRGCMLIELPLRGRLQLEACGMRRKSLLTRJK 

VICKSDAPTGDVLLDEALKHVKETQPPETVQ 

NWIELLSGETWNPLKLHYQLRNVRERLAKNL 

V bitOVLT 1 EKQNFLLFDMTTHPLIT^O^NIKQR 

LIKKVQEAVLDKWVNDPHRMDRRLLAL1YL 

AHASDVLENAFAPLLDEQYDLATKRVRQLLD 

LDPEVECLKANTNEVLWAWAAFTK 


761 


2111 


A 


6078 


833 


390 


IVSFHLSGFKKFVRPFSFLSVHGLQVDEYHSV 
HQKXSADMADHSNLIRSLLVGAEDARLMRD 
MKTMKSRYMELYDLNRD1LNGYKJRWNNH 
TELLGNLKAVNQAIQRAGRLRVGKPKNQVIT 
ACRDAIRSNNINTLFK1MRVGTASS 


762 


2112 


A 


6079 


2 


2686 


KKAITCGEKEKQDLIKSLAMLKDGFRTDRGS 

HSDLWSSSSSLESSSFPLPKQYLDVSSQTDISG 

SFGIM SNNQLAEKVRLRLRYEE AKRRI ANLKI 

QLAKLDSEAWPGVLDSERDRLILINEKEELLK 

EMRF1SPRKWTQGEVEQLEMARKRLEKDLQ 

AARDTQSKALTERLKLNSKRNQLVRELEEAT 

RQVATLHSQLKSLSSSMQSLSSGSSPGSLTSSR 

GSLVASSLDSSTSASFTDLYYDPFEQLDSELQ 

SKVEFLLLEGATGFRPSGCITTIHEDEVAKTQ 

KAEGGGRLQALRSLSGTPKSMTSLSPRSSLSS 

PSPPCSPLMADPLLAGDAFLNSLEFEDPELSA 

TLCELSLGNSAQERYRLEEPGTEGKQLGQAV 

NTAQGCGLKVACVSAAVSDESVAGDSGVYE 

ASVQRLGASEAAAFDSDESEAVGATR1QIALK 

YDEKNKQFAIUIQLSNLSALLQQQDQKVNIR 

VAVLPCSESTTCLFRTRPLDASDTLVFNEVFW 

VSMSYPALHQKTLRVDVCTTDRSHLEECLGG 

AQISLAEVCRSGERSTRWYNLLSYKYLKKQS 

RELKPVGVMAPASGPASTDAVSALLEQTAVE 

LEKRQEGRSSTQTLEDSWRYEETSENEAVAE 

bbbbbvbbbbObEDVr 1 bKASPDMDGYPALK 

VDKETNTETPAPSPTVVRPICDRRVGTPSQGPF 

LRGSTIIRSKTFSPGPQSQYVCRLNRSDSDSST 

LSKKPPFVRNSLERRSVRMKRPSPPPQPSSVK 

SLRSERLIRTSLDLELDLQATRT WH SQLTQ EIS 

VLKFI KFOl EOAlCSHnFXFT POW7 RFHPT?FP 

LLLRMLEKRMDRAEHMGELQTDKMMRAAA 
KDVHRLRGQSCKEPPEVQSFREKMAFFTRPR 
MNIPALSADDV 


763 


2113 


A 


6082 


3 


1558 


PHPIRFSKLCVSFNNQEYNQFCVIEEASKANE 
VLENLTQGKMCLVPGKTRKLLFKFVAKTED 
VGKXIEITSVDLALGNETGRCWLNWQGGGG 
DAASSQEALQAARSFKRRPKLPDNEVH WGS II 
IQASTMIISRVPNISVHLLHEPPALTOEMYCLV 

VTVQSHEKTQIRDVKJLTAGLKPGQDANLTQK 

THVTLHGTELCDESYPALLTDIPVGDLHPGEQ 

LEKMLYVRCGTVGSRMFLVYVSYL1NTTVEE 

JsJilVCKCJrtKXJb J V 1 lb I vtrrU VA Vlvr Vol ttr 

EHLERVYADIPFLLMTDLLSASPWALTIVSSE 

LHLAPSMTTVDQLESQVDNV1LQTGESASECF 

CLQCPSLGNIEGGVATGHYIISWKRTSAMENI 

PIITTVITLPHVIVENIPLHVNADLPSFGRVRES 

LPVKYI ILQNKTDLV QD VEIS VEPSD AFMF SG 

LKQIRLRILPGTEQEMLYNFYPLMAGYQQLPS 

LN[NLLRFPNFTNQLLRRFIPTSIFVKPQGRLM 

DDTSIAAA 


764 


2114 


A 


6093 


1 


1422 


AAADLANSNAGAAVGRKAGPRSPPSAPAPAP 

PPPAPAPPTLGNNHQESPGWRCCRPTLRERN 

ALMFNNELMADVHFVVGPPGATRTVPAHKY 

VLAVGSSVFYAMFYGDLAEVKSEIHIPDVEPA 

AFLILLKYMYSDEIDLEADTVLATLYAAKKYI 

VPALAKACVNFLETSLEAKNACVLLSQSRLF 

EEPELTQRCWEVIDAQAEMALRSEGFCEIDR 

QTLEIIVTREALNTKEAVVFEAVLNWAEAEC 

KRQGLPITPRNKRHVLGRALYLVRIPTMTLEE 

FANGAAQSDILTLEETHSIFLWYTATNKPRLD 

FPLTKRKGLAPQRCHRFQSSAYRSNQWRYRG 

RCDSIQFAVDRRVFIAGLGLYGSSSGKAEYSV 

KIELKRLGVVLAQNLTKFMSDGSSNTFPVWF 

EHPVQVEQDTFYTASAVLDGSELSYFGQEGM 

TEVQCGKVAFQFQCSSDSTNGTGVQGGQIPE 

LIFYA 


765 


2115 


A 


6099 


1 


1150 


SGFTH YAI YDFI VKG SCFCNVHADQCIPVHGF 

RPVKAPGTFHMVHGKCMCKHNTAGSHCQH 

CAPLYNDRPWEAADG KTG APNECRTCKCNG 

HADTCHFDVNVWEASGNRSGGVCDDCQHN 

TEGQYCQRCKPGFYRDLRRPFSAPDACKPCS 

CHPVGSAVLPANSVTFCDPSNGDCPCKPGVA 

GRRCDRCMVGYWGFGDYGCRPCDCAGSCD 

PITGDCISSHTDIDWYHEVPDFRPVHNKSEPP 

WEWEDAQGFSALLHSGKCECKEQTLGNAKA 

rUUMKYo Y VLlSJKJL-oArlUNlj i rl VJb VIN Vivliv 

KVLKSTKLKJFRGKRTLYPESWTDRGCTCPIL 

NPGLEYLVAGHEDIRTGKJLIVNMKSFVQHWK 

PSLGRKVMDILKRECK 


766 


2116 


A 


6103 


2 


384 


MTAAATATVLKEGVLEKRSGGLLQLWKRKR 
CVLTERGLQLFEAKGTGGRPKELSFARIKAVE 
C Vbo 1 OKxil i r 1 L V 1 bOuObl Ur KLrLfcUrU W 
NAQITLGLVKFKNQQAIQTVRARQSLGTGTL 
VS 


767 


2117 


A 


6106 


1 


542 


bGobHAoiXjour ybbKlLobUt^ 1 rLlAOMLaLr 
MARYYIIKYADQKALYTRDGQLLVGDPVAD 
NCCAEKJCTLPNRGLDRTKVPIFLGIQGGSRC 
LACVETEEGPSLQLEDVNIEELYKGGEEATRF 

TFFOC.C.^fi^AFRI FAAAWPOWFI CGPAEPOO 
PVQLTKESEPS ARTKF YFEQSW 


768 


2118 


A 


6109 


3 


292 


FILQAVLQLSSQEARYKAFGTCVSHIGAILAF 

YTPSVISSVMHRVARCAAPHVH1LLANFYLLF 

PPMVNPIIYGVKTKQIRDSLGSIPEK.GCVNRE 


769 


2119 


A 


6110 


1 


711 


RHEPSCSNGVASTKSKQNHSKYPAPSSSSSSS 
SSSSSSSPSSVNYSESNSTOSTKSQHHSSTSNQ 
ETSDSEMEMEAEHYPNGVLGSMSTRIVNGAY 
KHEDLQTDESSMDDRHPRRQLCGGNQAATE 
RIILFGRELOALSEOLGREYGKNI AHTFN/TT on 
AFSLLAYSDPWSCPVGQQLDPIQREPVCAAL 
NSAILESQNLPKQPPLMLALGQASECLRLMA 
RAGLGSCSFARVDDYLH 


770 


2120 


A 


6125 


2 


570 


YFGLNLHVQHLGNNVFLLQTLFGAVILLANC 
VAPWALKYMNRRASQMLLMFLLAICLLAIIF 
VPOEMOMI REVLATT GT GA^AT A NTT AFAT-T 

GNEVIPTIIRARAMGINATFANIAGALAPLMM 
ILSVYSPPLPWIIYGVFPFISGFAFLLLPETRNK 
PLFDTIQDEKNERKDPREPKQEDPRVEVTQF 


771 


2121 


A 


6126 


909 


353 


RSFVLDTASAICNYNAHYKNHPKYWCRGYF 
RDYCNHAFSPNSTNHVALRDTGNQLIVTMSC 
LTKEDTG WYWCGIQRDFARDDMDFTELI VT 
DDKGTLANDFWSGKDLSGNKTRSCKAPKW 
RKADRSRTSILIICILITGLGnSVISHLTKRRRS 

ORNRRVfTMTT VPPQPVT TPlfFlurAPTPrvuf 
v^rviN jtviv v VJ1N i LAxr otv VL1 rJ\JlJVl/Vr 1 Il^lVl 


772 


2122 


A 


6148 


7 


810 


FVLGILALSHTISPFMNKFFPASFPNRQYQLLF 

TQGSGENKEEIINYEFDTKDLVCLGLSSIVGV 

WYLLRKHWIANNLFGLAFSLNGVELLHLNN 

VSTGCILLGGLF1 YDVF WVFGTN VM VT VAK S 

FEAPIKLVFPQDLLEKGLEANNFAMLGLGDV 

VTPGIFIALLLRFDISLKKNTHTYFYTSFAAY1F 

GLGLTIFIMHIFKHAQPALLYLVPACIGFPVLV 

ALAKGEVTEMFSYEESNPKDPAAVTESKEGT 

EASASKGLEKKEK 


773 


2123 


A 


6161 


3 


1088 


CQPMLVTRKNHPKLLLRRTESVAEKMLTNW 
FTFLLYKFLKESAGEPLFMLYCAIKHQMEKG 

PIF1 ATTflJ? AT? VQT Ct?nVT Tool TTWVTT TT \tp\/ 

NPENENAPEVPVKGLDCDTGTQAKEKLLDA 

AYKGVPYSQRPKAADMDLEWRQGRMARJIL 

QDEDVTTKIDNDWKRLNTLAHYQVTDGSSV 

ALVPKQTSAYNISNSS'l'Fi'KSLSRYESMLRTA 

SSPDSLRSRTPMITPDLESGTKLWHLVKNHDH 

LDQREGDRGSKMVSEIYLTRLLATKGTLQKF 

VDDLFETIFSTAHRGSALPLAIKYMFDFLDEQ 
ADKHOIHDADVRHTWKS'NPT PI RFWVMVTV 

NPQFVFDIHKNSrrDACLSW 


774 


2124 


A 


6163 


860 


125 


KTAVKKRNLNPVFNETLRYSVPQAELQGRVL 

SLSVWHRESLGRNIFLGEVEVPLDTWDWGSE 

PTWLPLQPRVPPSPDDLPSRGLLALSLKYVPA 
GSEGAGLPPSGFT HKWVKFAPm T PT R AHQT 

DTYVQCFVLPDDSRASR(^TRWRRSLSPVF 
NHTMVYDGFGPADLRQACAELSLWDHGALA 
NRQLGGTRLSLGTGSSYGLQVPWMDSTPEEK 
QLWQALLEQPCEWVDGLLPLRTNLAPRT 


775 


2125 


A 


6191 


2 


392 


ARGIGSLGRDHSGSGGGTGMAGAWVRKAAD 

YVRSKDFRDYLMSTHFWGPVANWGLPIAAIT 

DMK\KSPEnSRRMTFAL*CYSLTFVRFAHYVQ 

VPWNWIJr^tXK^HTAVDFDQLISSMPCISHGMT 

ASASAL 


776 


2126 


A 


6217 


1 


827 


FRGYWGVREAF"! DAS WSGGLGPGKPGMKIT 

RQKHAKKHLGFFRNNFGVREPYQILLDGTFC 

QAALRGRJQLREQLPRYLMGETQLCTTRCVL 

KELETLGKDLYGAKLIAQKCQVRNCPHFKNA 

VSGSECLLSMVEEGNPHHYFVATQDQNLSVK 

VKKKPGVPLMFIIQNTMVLDKPSPKTIAFVKA 

VESGVRLSQCMRKK V SNISKRNR V* * KTLNRG 

RRKKRKKJSGPNPLSCLKKKKKAPDTQSSASE 

KKRKRKRIRNRSNPKVLSEKQNAEGE 


777


2127 


A 


6236 


1038 


1402 


YYQISSLPSIVGNGIFLWLUCIFLAKQGGSRL* 

YNSKISPALWGPPVIPSALGGEAGKSL*PRRQ 
RFQRGGIAPLPSRVRGRAKLFLKKK 


778 


2128 


A 


6237 


422 


913 


ASFFHHHRGAFLLLLAIPGS*GQDQSLIHWSN 
AVSNAD\LLDLK\N*LDH\LEEKMPL\EVKVVP 
r^VLwtl N'KouuLroArorfcVITW 1 UH VlSJr/ 

SPQRDGG AL G\QGPLGI PSDSI LALLKKQT*RA 
LLNWPLGSLRRSSCFGGQDGQDLKPRSGLGC 
NSFRYRR 


779 


2129 


A 


6249 


420 


36 


ARAPSPSFSVRDVELSDPARERGEMPVAVGP 

YGQSQPSCFDRVKMGFVMGCAVGMAAGAL 

FGTFSCLSSILVSSSG/SGMRGRELMGG1GKTM 

MQSGGTFGTFMAIGMGIRC*P\VLPTTSVPSH 

QSQPMY 


780 


2130 


A 


6263 


415 


1380 


R1MRMCDRGIQMLITTVG AFAAF SLMTI A VG 

TDYWLYSRGVCRTKSTSDNETSRKNEEVMT 

HSGLWRTCCLEGAFRGVCKKIDHFPEDADYE 

QDTAEYLLRAVRASSVFPILSVTLLFFGGLCV 

AASEFHRSRHNVILSAGIFFVSAGLSNIIGIIVYI 

S\ANAGRTPGQR\DSKKSYSYGWSF/YFSGAFS 

FIIGR/IIC*GVGLPWHIYIEKHQQLRAKSHSEF 

LKKSTFARLPPYRYRFRRRSSSRSTEPRSRDLS 

PISKGFHTIPSTDISMFTl^RDPSKJTMGTLLNS 

DRDHAFLQFHNSTPKEFKESLHNNPANRRTT 

PV 


781 


2131 


A 


6274 


832 


318 


RIIKVKDLKQTLAIKTAYPRCKCLVEMDQIFH 

LQVKQKQLACLCTWQARDPDCPPSTKWL/L 

VGPGMGCMVALFQDSIAWSNKSMPSSLSAIS 

QSPCQVQAPEGPSSFHLPTLSFTTCLSWQGGD 

LEFLGDLKGCSELKNFQELITQSALVHPKADV 

WWYCGRPLLGTLPSN 


782 


2132 


A 


6281 


1324 


393 


WISLPSSLLCRKNGSSAEDDRRVGEPSAEEAEG 

EREDWGIGSA*SVGAVSK.VPSARF*RTYPS\E 

DEEEVTHQKSSSSDSNSEEHRKKKTSRSRNK 

KKRKNKSSKRKHRKYSDSDSNSESDTNSDSD 

DDKKJIVKAKKKKKKKKHKTKKKKNKKTKK 

ESSDSSCKDSEEDLSEATWMEQPNVADTMDL 

IGPEAPIIHTSQDEKPLKYGHALLPGEGAAMA 

EYVKAGKRIPRRGE1GLTSEEIGSFECSGYVM 

SGSRHRRMEAVRLRKENQIY SADEKRALASF 

NQEERRKRESKILASFREMVHKKTKGKDDK 


783 


2133 


A 


6305 


201 


1032 


WDD YPQGALRRRE AAEGLHFLGPP GR VRGQ 
LRGITGPAWYCHSPSHSLLSAFCHLPTPSRCP 
AMARPPVPGSWVPNWHES/RRGQGVPGLHS 
AQEPPAGVWAA*AASAAAA\LS1DTASYKIFV 
SGKSGVGKTALVAKLAGLEVPWHHETTGIQ 
TTWFWPAKLQASSRWMFRFEFWDCGESA 
LKKFDHMLLACMENTD AFLFLFSFTDRA SFE 
DLPGQL AR1AGE A PGWRMV IG SKFDQYMHT 

D V rcKUL 1 At Kv^A W liLr LrL.K V Js.o V r vjivKJLO 


784 


2134 


A 


6308 


86 


96 


GSSPDPASLITMKNQDKKNGAAKQSNPKSSP 
GQPEAGPEGAQERPSQAAPAVEAEGPGSSQA 
PRKPEGAQARTAQSGALRDVSEELSRQLEDIL 
STYCVDNNQGGPGEDGAQGEPAEPEDAEKSR 
TYVARNGEPEPTPVVNGEKEPSKGDPNTEE1R 
QSDEVGDRDHRRPQEKKKAKGLGKEITLLM 
QTLNTLSTPEEKLAALCKKY AELL EEHRNSQ 
KQMKLLQKKQ SQL VQEKDHLRGEH SKAVLA 

RSKLESLCRELQRHNRSLKEEGVQRAREEEE 

KRKEVTSHFQVTLND1QLQMEQHNERNSKLR 

QENMELAERLKKLIEQYELREEHIDKVFKHK 

DLQQQLVDAKLQQAQEMLKEAEERHQREKD 

FLLKEAVESQRMCELMKQQETHLKQQLALY 

TEOEEFQNTLSKSSEWTTFKQEMEKMTKKI 

KJCLEKETTMYRSRWESSNKALLEMAEEKTV 

RDKhLEGLQViUQRLEKJ^CRALQT/GAQ^PVR 

GQRWGSHRTSAVRIFS 


785 


2135 


A 


6319 


1493 


889 


SPQGPLLRSVSPVSAGASSVTPGGAQPGVTTT 

PPSLVAVAPAPGSAAGPAAGWQ*HAGCR/WT 

KLPWSWGMRPMKIFFSEEYRSISTRISHDAL* 

EKCTQPAKPLSMIR\TGSSVSPG/PLVKWN\VT 

RREFRNSGTRWSSCCGMSCMYSFLGHCSV/S 

QDLPLVHVDVGWQPPLGPTVGLRPGLLPLHD 

TTPCQKXVVDDLDWA 


786 


2136 


A 


6320 


551 


135 


RWLPV AECD SSC VGCTGEGPGNCKECI SGYA 
REHGQCADVDECSLAEKTCVRKNENCYNTP 
GSYVCVCPDGFEET/RRCLCAAGRG* SHRRRK 
PDTAALPRRPVMCRTYPLNYSEGCPVENVAL 
RMPSPAVDSGGERLPAL 


787 


2137 


A 


6330 


1693 


227 


DYVLTAELHRQRSPGVSFGLSVFNLMNAIMG 

SGILGLAYVMANTGVFGFSFLLLTVALLASYS 

VHLLLSMCIQTAYLGP*TNYFMVLPAH*LTCL 

PLIEFLQSL*NSL\*AVTSYEDLGLFAFGLPGKL 

WAGTIIIQNIGAMSSYLLIIKTELPAAIAEFLT 

GDYSRYWYLDGQTLLIIICV GIVFPLALLPKIG 

FLGYTSSLSFFFMMFFALVVIIKKWSIPCPLTL 

NYVEKGFQISNVTDDCKPKLFHFSKESAYALP 

TMAFSFLCHTS ILPIYCELQSPSKKRMQN VTN 

TAIALSFLIYFISALFGYLTFYD/GTTKAQRGE 

VTCHRIKDKVESELLKG***IP*SHDVWMT\V 

KLCILFAVLL\TVPLIHFPARKAVTMMFFSNFP 

FSWIRHFLITLAL1MIIIVLLAIYVPDIRNVFGW 

GASTSTCLIFI FPGLF YLKLSREDFLSWKKLGV 

GCFC/LLSFKTSILRNSLSVYHLPASRKSIYFKI 


788 


2138 


A 


6351 


1 


6622 


PRSLCFSLWAEAAVLADGGLRRRRRLLRGTM 

SASFVPNGASLEDCHCNLFCLADLTGIKWKK 

YVWQGPTSAPILFPVTEEDPILSSFSRCLKADV 

LG/VWRRDQRPERREVL* IFWGGEDPWLLTLF 

TNflTQKJCKMECGRMDFPMNAVLCFSKAVH 

NLLERCLMNRNFVRIGKWFVKPYEKDEKPIN 

KSEHLSCSFTFFLHGDSNVCTSVEINQHQPVY 

LLSEEHITLAQQSNSPFQVILCPFGLNGTLTGQ 

AFKMSDSATKKLIGEWKQFYPISCCLKEMSE 

EKQEDMDWEDDSLAAVEVLVAGVRMIYPAC 

FVLVPQSDIPTPSPVGSTHCSSSCLGVHQVPAS 

TRDPAM SS VTLTPPTS PEEVQTVDPQS VQKW 

VKFSSVSDGFNSDSTSHHGGKIPRKLANHW 

DRVWQECNMNRAQNKRKYSASSGGLCEEAT 

AAKVASWDFVEATQRTNCSCLRHKNLKSRN 

AGv^OQArbLGC^QQQlLrKiiKTNEKQEKSl^ 

PQKRPLTPFHHRVS V SDDVGMD\ADS\ASQRL 

V\ISAP\DSQ\VRFSNIR\TNDVAK\TPQMHGTE 

MANSPQPPPLSP\HPCDVVDEGVTKTPSTPQS 

QHFYQMPTPDPLVPSKPMEDRIDSLSQSFPPQ 

YQEAVEPTVYVGTAVNLEEDEANIAWKYYK 

FPKKKDVEFLPPQLPSDKFKDDPVGPFGQESV 

TSVTELMVQCKKPLKVSDELVQQYQIKNQCL 

SAIASDAEQEPKIDPYAFVEGDEEFLFPDKKD 

RQNSEREAGKKHKVEDGTSSVTVLSHEEDA 

MSLFSPSIKQDAPRPTSHARPPSTSLIYDSDLA 

VSYTDLDNLFNSDEDELTPGSKRSANGSDDK 

ASCKESKTGNLDPLSC1STADLHKMYPTPPSL 

EQHIMGFSPMNMNNKEYGSMDTTPGGTVLE 

GNSSS1GAQFKIEVDEGFCSPKPSEIKDFSYVY 

KPENCQILVGCSMFAPLKTLPSQYLPLIKLPEE 

aYRQSWTVGKLELLSSGPSMPFlKEGDGSNM 

DQEYGTAYTPQTHTSCGMPPSSAPPSNSGAGI 

LPSPSTPRFPTPRTPRTPRTPRGAG GPASAQGS 

VKYENSDLYSPASTPSTCRPLNSVEPATVPSIP 

EAHSLYVNLILSESVMNLFICDCNSDSCC1CVC 

NMNIKGADVGVYIPDPTQEAQYRCTCGFSAV 

MNRKFGNNSGLFFEDELDDGRNTDCGKEAE 

KRFEALRATSAEHVNGGLKESEKLSDDLILLL 

QDQCTNLFSPFGAADQDPFPKSGV1SNWVRV 

EERDCCNDCYLALEHGRQFMDNMSGGKVDE 

AL VKSSCLHPWSKRNDVSMQCSQDILRMLLS 

LQPVLQDAIQKKRTVRPWGVQGPLTWQQFH 

KMAGRGSYGTDESPEPLPIPTFLL G YD YDYL V 

LSPFALPYWERLMLEPYGSQRDIAYVVLCPE 

NEALLNGAKSFFRDLTAIYESCRLGQHRPVSR 

LLTDG1MRVGSTASKKLSEKLVAEWFSQAAD 

GNNEAFSKLKLYAQVCRYDLGPYLASLPLDS 

SLLSQPNLVAPTSQSLITPPQMTNTGNANTPS 

ATLASAASSTMTVTSGVAISTSVATANSTLTT 

ASTSSSSSSNLNSGVSSNKLPSFPPFGSMNSNA 

AG SM STQ ANTVQSGQL GGQQTS ALQTAGI S G 

E SS SLPTQPHPD VSE STMDRDK VGIPTDG DSH 

AVTYPPAIVVYnDPFTYENTDESTNSSSVWTL 

GLLRCFLEMVQTLPPHIKSTVSVQIIPCQYLLQ 

PVKHEDREIYPQHLKSLAFSAFTQCRRPLPTS 

TNVKTLTGFGPGLAMETALRSPDRPECrRLYA 

PPFILAPVKDKQTELGETFGEAGQKYNVLFV 

GYCLSHDQRWILASCTDLYGELLETCIINIDVP 

NRARRKKSSARKFGLQKLWEWCLGLVQMSS 

LPWRVVIGRLGRIGHGELKDWSCLLSRRNLQ 

SLSKRLKDMCRMCGISAADSPSILSACLVAM 

EPQGSFVIMPDSVSTGSVFGRSTTLNMQTSQE 

NTPQDTSCTHILVFPTSASVQVASATYTTENL 

DLAFNPNNDGADGMGIFDLLDTGDDLDPDII 

NILPASPTGSPVHSPGSHYPHGGDAGKGQSTD 

RLLSTEPHEEVPNILQQPLALGYFVSTAKAGP 

LPD WFWSACPQAQYQCPLFLKASLHLHVPS V 

QSDELLHSKHSHPLDSNQTSDVLRFVLEQYN 

ALSWLTCDPATQDRRSCLPIHFWLNQLYNFI 
MNML 


789 


2139 


A 


6359 


1


2002 


TGTLTEDGLDVMGVVPLKGQAFLPLVPEPRR 

LPVGPLLRALATCHALSRLQDTPVGDPMDLK 

MVESTGWVLEEEPAADSAFGTQVLAVMRPP 

L WEPQLQAMEEPP VPVS VLHRFPFSS ALQRM 

SWVAWPGATQPEAYVKGSPELVAGLCNPET 

VPTDFAQMLQSYTAAGYRVVALASKPLPSVP 

SLEAAQQLTRDTVEGDLSLLGLLVMRNJLLKP 

QTTPVIQALRRTRIRAVMVTGDNLQTAVTVA 

RG CGMVAPQEHLD VHATHPERGQPASLEFLP 

MESPTAVNGVKDPDQAASYTVEPDPRSRHLA 

LSGPTFGIIVKHFPKLLPKVLVQGTVFARMAP 

EQKTELVCELQKLQYCVGMCGDGANDCGAJL 

KAADVGISLSQAEASVVSPFTSSMASIECVPM 

VlKbuKL-oLiJl oroVrKYMALYbH QrloVLIL 

YTlNTNLGDLQFLAIDLVrrnVAVLMSRTGP 

ALVLGRVRPPGALLSVPVLSSLLLQMVLVTG 

VQLGGYFLTLAQPWFVPLNRTVAAPDNLPNY 

ENTWFSLSSFQYLILAAAVSKGAPFR\RPLTN 

NVPFLLASAL* SSVL WLVLSPGLLHGPL ALR 

NITDTGFKLLLVGLVTLNFVGGLHAGERARP 

VPPRLPAPPPAQAGVSKKRFKQLERELAEQPW 

PPLPAGPLR 


790 


2140 


A 


6380 


76 


1059 


S SAG S ARKLQ VMAL AARL WRLLPFRRG AAP 

GSRLPAGTSGSRGHCGPCRFRGFEVMGNPGT 

FKRGLLLSALSYLGFETYQVISQAAWHATA 

KVEEILEQADYLYESGETEKLYQLLTQYKESE 

DAELLWRLARASRDVAQLSRTSEEEKKLLVY 

EALEYAKRA/L/EKNESSFASHKWYAICLSDV 

GDYEGIKAKIANAYIIKEHFEKAIELNPKDATS 

IHLMGIWCYTFAEMPWYQRRIA+NACLQLPP 

*FPPYEKALG\YFHRAEQVDPNFYSKNLLLLG 

KTYLKLHNKKLAAFWLMKAKDYPAHTEED 

KQIQTEAAQLLTSFSEKN 


791 


2141 


A 


6434 


3 


1460 


IALLIVDGLAWDDQGGLALLHISPSKLIL*QDS 

SGMS/YVM VRCTITRAFFKSLLCHICQ Y SIG PQ 

*VT\CPGQDACKE*KSTAN*GG*RE**PQVLFF 

AFLSNPAVKFGRMSKKQRDSLYAEVQKHQQ 

RLQEQRQQQSGEAEALARVYSSSISNGLSNLN 

NETSGTYANGSVIDLPKSEGYYNWSGQPSP 

DQSGLDMT\GIKQIKQEPIYDLTSVPNLFTY\SS 

FNMGQLAPGITVMTEIDRIAQNHKSHLETCQY 

TMEELHQLAWQTHTYEE1KAYQSKSREALW 

QQCAIQITHAIQ YWEFAKRI TGFMELCQNDQ 

ILLLKSGCLEWLVRMCRAFNPLNNTVLFEG 

KYGGMQMFKALGSDDLWEAFDFAKNLCSL 

QLTEEEIALFSSAVLISPDRAWLIEPRKVQKLQ 

EKJYFALQHVIQKNHLDDETLAKLIAKIPTITA 

VCNLHGEKLQVFKQSHPEIVNTLFPPLYKELF 

NrDCA 1 ACK. 


792 


2142 


A 


6440 


92 


781 


SRGTFRCFCRDFFPCFSNMRLFLWNAVLTLFV 

TSLIGALIPEPEVKIEVLQKPFICPiRKTKGGDL 

MLVHYEGYLEKDGSLFHSTHKHNNGQPIWFT 

LGILEALKGWGPGA*K/DMCVGEKRJCLIIPPA 

LGYGKEGKGKIPPESTLIFNIDLLEIRNGPRSH 

ESFQEMDLNDDWKLSKDEVKAYLKXEFEKH 

GAVVNESHHDALVEDIFDKEDEDKDGFISAR 

EFTYKHDEL 


793 


2143 


A 


6446 


3201 


152 


PRLKRLVVTEEDGGARPEALGKIAPRTPAELG 

ARADQELVTALMCDLRRPAAGGMMDLAYV 

CEWEKWSKSTHCPSVPLACAWSCRNLIAFTM 

DLRSDDQDLTRMIHU.DTEHPWDLHSIPSEHH 

EAITC\LEWDQSGFPGFLFSRWPTGQIK\CWS 

MGVSTXA\NSWE\SSVGSL\VEGGPHLWALS\ 

WLH\NGVKLALHVEKSGASSFGEKFSR\VKFS 

SGQVL\TSTAESLCRLRARVALADIAFTGGGNI 

WATADGSSA\SPVQFYKVCVSVVSEKCRIDT 

DILPSLFMRCTTDLNRKDKFPAITHLKFLARD 

MSEQVLLCASSQTSSIVECWSLRKEGLPVNNI 

FQQISPVVGDKQPTILKWR1LSATNDLDRVSA 

VULPKLPISLTNTDLKVASDTQFYPGLGLAL 

AFHDGSVHIVHRLSLQTMAVFYSSAAPRPVD 

EPAMKRPRTAGPAVHLKAMQLSWTSLALVG 

IDSHGKLSWLRLSPSMGHPLEVGLALRHLLFL 

LEYCMVTGYDWWDILLHVQPSMVQSLVEKL 

HEEYTRQTAALQQVLSTRILAMKASLCKLSP 

CTVTRVCDYHTKLFLIAISSTLKSLLRPHFLNT 

PDKSPGDRLTEICTKITDVDIDKVMINLKTEEF 

VLDMNTLQALQQLLQWVGDFVLYLLASLPN 

QPCPTSEPCPTSEPSPTSEPSPTSEPSSP*SLC\G 

SLLRPGHSFLRDGTSLGMLRELMVVIRIWGLL 

RDEGPASEPDEALVDECCLLPSQLLIPSLDWL 
PASDHT VSRT OPKOPT RT OFGRAPTI PGSAAT 

LQLDGLARAPG QPKIDHLRRLHLG ACPTEEC 

KACTRCGCVTMLKSPNRTTAVKQWEQRWIK 

NC/LVRWALVAGAPQLPLSPAAPQLLLSYPSA 

APEPGCCKSHRSPWTLLGAVNLSPPCRAVEG 

RGPDACVTSRASEEAPAFVQLGPQSTHHSPRT 

PRSLDHLHPEDRP 




794


2144


A 

A 




416 


Jo J 


MnniCAni rmf^ppaovi mpvvpat wfafog 
GSIEPRDLRLQ*AVITPL\TPAWVTQ 


795 


2145 


A 


6499 


395 


1027 


KLLWLPPHSEQKRSPLYHPQGPSGTTPSAP\FS 

SHSPPPSLLQAVPSIAAFLRTHGHISASGPLRMP 

FPH/H*NAFLLVFPGQRSQLTS/PSHYLCREVFP 

DHHHHLCRLSLESSPLFHHRVLFCVPKQNVN 

STRAQIFCLFVHIVGCRCINTFPLHLFRLHLWL 

HFL QIPLCKKNKSVKLGKTWGRGCQS AAG S 

DTRVRAAVGAPGLPVEPLV 


796 


2146 


A 


6503 


68 


936 


HSALLTHSSFCVFTLCQDFFTYSSMSEEVTYA 

ULt yr v<JiN ODiilVXDivlr 11J VjJvt v lEJVArr Ar orl V W K 

PAALFLTLLCLLLLIGLGVLASMFHVTLKIEM 

KKMNKLQNISEELQRNISLQLMSNMNISNKIR 

NLSTTLQTIATKLCRELYSKEQEHKCKPCPRR 

WIWHKDSCYFLSDDVQTWQESKMACAAQN 

ASLLKINNKNALEFIKSQSRSYDYWLGLSPEE 

DS/YSWYESG*YNQ\PSAWV1RNAPDLNNMY 

CGYINRLYVQYYHCTYKQRMICEKMANPVQ 

LGSTYFREA 


797 


2147 


A 


6507 


1 


881 


PGSTHASARSQVPRSAGEAAPHSRRPPGLLPH 

APRAASAQLEERMRDPHPGMTLQEGDCRGS 

QTVSLTMGTADSDEMAPEAPQHTfflDVHIHQ 

ESALAKLLLTCCSALRPRATQARGSSRLLVAS 

WVMQIVLGILSAVLGGFFYIRDYTLLVTSGA 

AIWTGAVAVLAGAAAFIYEKRGGTYWALLR 

TLLALAAFSTAIAALKLWNEDFRYGYSYYNS 

ACRI SSS SDWNTP APTQSPEE VRRLHLCTSFM 

DMLKALFRTLQAMLLGVWILLLLASLTPLWL 

/SL/RGECSQPKG*VPKKRDQKEMLEVSGI*PG 

STHA S ARSQ VPRSAGEAAPHSRRPPGLLPHAP 

VSLTMGTADSDEMAPEAPQHTHIDVHIHQES 

ALAKLLLTCCSALRPRATQARGSSRLLVASW 

VMOI VLGILSA VLGGFFYIRD YTLLVTS GAAI 

WTGAVAVLAGAAAFIYEKRGGTYAVALLRTL 

LALAAFSTAIAALKLWNEDFRYGYSYYNSAC 

RISSSSDWNTPAPTQSPEEVRRLHLCTSFMDM 

LKALFRTLQAMLLGVWILLLLASLTPLWLYC 

WRMFPTKGVSP 


798 


2148 


A 


6528 


912 


2287 


VPNYLPSVSSAIGGEVPQRYVWRFCIGLHSAP 

RFLVAFAYWNHYLSCTSPCSCYRPLCRLNFG 

LNWENLALLVLTYVSSSEDF/TWVPG*GRSG 
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EVFPEGTGLPLPHSDLPTSWCGHSLQCGSQSS 

FPPAIHENAFIVFIASSLGHMLLTCILWRLTKK 

in v &\^ij\xj\jLjOijt\\jf\jri\\£jr ivivivorx. 1 o vi_,i\j.K.V 

MVRWELSSNGNPGRGVLGLGLGLGNKLRVV 

GQNLGL*HCVWWWETGE*KRWRLQMGEE* 

GVASRRQ*VRNSVRGLVCHNSSAPPMYMGFF 

SPTVFGGGVGG*LHVTFILHPPEVEAAGIPLLL 

GPSLPQRQGREHIWILAAPACAPFHDR*WEP 

REIRPSP*ELGLRGEPTLSYPASCRVIRQPIP* D 

RKSYSWKQRLFIINFISFFSALAVYFRHNMYC 

EAGWTIFAILEYTVVLTNMAFHMTAWWDF 

GNKELLITSQPEEKRF 


799 


2149 


A 


6529 


1


874 



r r r r v^kiin r lr.rl oU i> V o LLALACUJL Cj W CfcJJWo 

CCLVQGGGDLVDWQTNHGEDEAGGDTDSV 

DEARCKESQQEAQENLREDLCLESFAKDKIL 

QIIEGSEREHEETRTKQAALDGEPLGGGQLTA 

VHLHPSKEQQGQEGGERQRGARTHHWRGW 

EKGRRVRLRPPSGKLRADQPVRKLGGPTPS/T 

ELPGLQPHAPTPHTA/PATPTYSPAPDTPNPPV 

AWKLrLrVtrK I Kv^LUKh-K 1 KJvACrrKJrRPPL 

GLPGDPTGPVTHHAPPVSPTGASGQERRAEP 

GAVSYAHASATK 


800 


2150 


A 


6544 


2 


662 


SAQRWAAVAGRWGCRLLALLLLVPGPGGAS 

EITFELPDNAKQCFYEDIAQGTKCTLEFQVITG 

GHYDVDCRLEDPDGKVLYKEMKKQYDSFTF 

TASKNGTYKFCFSNE\FSTFTHKTVYFDFQVG 

EATHLCFLVR/DRVSALTQMESACVSIHEALKS 

VIDYQTHFRLREAQGRSRAEDLNTRVAYWSV 

GEALILLVVSIGQVFLLKSFFSDKRTTTTRVGS 


801 


2151 


A 


6556 


1 


1319 


TPCMECDCGEGLREPQNLSGSQREPQTEGSM 

DGWRRMPRWGLLLLLWGSCTFGIJTOTTTF 

KJUFLKRMPS1RESLKERGVDMARLGPEWSQP 

MKRLTLGNTTSSVILTNYMDTQYYGEIGIGTP 

r\Jlrls. V VrUl uboN V W VrboiVLSKLYl ACVY 

HKLFDASDSSSYKHNGTELTLRYSTGTVSGFL 

SQDIITVGGITVTQMFGEVTEMPALPFMLAEF 

DGWGMGFIEQAIGRVTPIFDNIISQGVLKED 

VFSFYYNRDSENSQSLGGQIVLGGSDPQHYE 

GNFHYINLIKTGVWQIQMKGVSVGSSTLLCE 

DGCLAL VDTGAS YISG STSSIEKLMEALG AKE 

KRLFDYWKCNEGPTLPPTFLFLLGGKDTPLT 

SADYLFQESYSSKKLSTLAIHAMYIPPPTGPTL 

\ALGATF\IRKFYTEFDRGNNPHGFALAR 


802 


2152 


A 


6567 


13 


6147 


MCLGRMGASSPRSPEPVGPPAPGLPFCCGGSL 

LAVWLLALPVAWGQCNAPEW\LPFARPTNL 

TDEFEFPIGTYLNYECRPGYSGRPFSIICLKNS 

VWTGAKDRCRRKSCRNPPDPVNGMVHVIKG 

IQFGSQIKYSCTKGYRLIGSSSATCIISGDTVIW 

DNETPICDRIPCGLPPTITNGDFISTNRENFHY 

GSVVTYRCNPGSGGRKVFELVGEPSIYCTSND 

DQV GI WSGPAPOCIIPNKCTPPNVENGIL VSD 

NRSLF SLNEWEFRCQPGFVMKGPRRVKCQA 

LNKWEPELPSCSRVCQPPPDVLHAERTQRDK 

DNFSPGQEVFYSCEPGYDLRGAASMRCTPQG 

DWSPAAPTCEVKSCDDFMGQLLNGRVLFPV 

NLQLGAKVDFVCDEGFQLKGSSASYCVLAG 

MESLWNSSVPVCEQIFCPSPPVIPNGRHTGKP 

LEVFPFGKAVNYTCDPHPDRGTSFDLIGESTIR 

CTSDPQGNGVWSSPAPRCGILGH CQAPDHFL 

FAKLKTQTNASDFPIGTSLKYECRPEYYGRPF 
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SITCLDNLVWSSPKDVCKRKSCKTPPDPVNG 

NfVHVlTDIQVGSRrNYSCTTGHRLIGHSSAECI 

LSGNAAHWSTKPPICQRIPCGLPPTIANGDFIS 

TNORENFHYGSVVTYRCNPGSGGRKVFELVGE 

PSIYCTSNDDQVGIWSGPAPQCnPNKCTPPNV 

ENGILVSDNRSLFSLNEVVEFRCQPGFVMKGP 

RRVKCQALNKWEPELPSCSRVCQPPPDVLHA 

ERTQRDKDNFSPGQEVFYSCEPGYDLRGAAS 

MRCTPQGDWSPAAPTCEVKSCDDFMGQLLN 

GRVLFPVNLQLGAKVDFVCDEGFQLKGSSAS 

YCVLAGMESLWNSSVPVCEQEFCPSPPVIPNG 

RHTGKPLEVFPFGKAVNYTCDPHPDRGTSFD 

LIGESTIRCTSDPQGNGVWSSPAPRCGILGHC 

QAPDHFLFAKLKTQTNASDFPIGTSLKYECRP 

EYYGRPFS1TCLDNLV W SSPKDVCKRKSCKTP 

PDPVNGMVHVITDIQVGSRINYSCTTGHRLIG 

H SSAECILSGNT AH WSTKPPICQRIPCGLPPTI 

ANGDFISTNRENFHYGSVVTYRCNLGSRGRK 

VFELVGEPSIYCTSNDDQVGIWSGPAPQCIIPN 

KCTPPNVENGILVSDNRSLFSLNEWEFRCQP 

GFVMKGPRRVKCQALNKWEPELPSCSRVCQ 

PPPEILHGEHTPSHQDNFSPGQEVFYSCEPGY 

DLRGAASLHCTPQGDWSPEAPRCAVKSCDDF 

LGQLPHGRVLFPLNLQLGAKVSFVCDEGFRL 

KGSSVSHCVLVGMRSLWNNSVPVCEHIFCPN 

PPAILNGRHTGTPSGDIPYGKEISYTCDPHPDR 

GMTFNLIGESTIRCTSDPHGNGVWSSPAPRCE 

LSVRAGHCKTPEQFPFASPTIPINDFEFPVGTS 

LNYECRPGYFGKMFSISCLENLVWSSVEDNC 

RRKSCGPPPEPFNGMVHINTDTQFGSTVNYSC 

NEGFIU.IGSPSTTO.VSGWIVTWDKKAPiCEII 

SCEPPPTISNGDFYSNNRTSFHNGTWTYQCH 

TGPDGEQLFELVGERSIYCTSKDDQVGVWSS 

PPPRCISTNKCTAPEVENA1RVPGNRSFFSLTEI 

IRFRCQPGFVMVGSHTVQCQTNGRWGPKLPH 

CSRVCQPPPEILHGEHTLSHQDNFSPGQEVFY 

SCEPSYDLRGAASLHCTPQGDWSPEAPRCTV 

KSCDDFLGQLPHGRVLLPLNLQLGAKVSFVC 

DEGFRLKGRSASHCVLAGMKALWNSSVPVC 

EQIFCPNPPAILNGRHTGTPLGDIPYGKEVSYT 

CDPHPDRGMTFNLIGESTIRRTSEPHGNGVWS 

SPAPRCELPVGAACPHPPKIQNGHYIGGHVSL 

YLPGMTISYTCDPGYLLVGKGFIFCTDQGIWS 

QLDHYCKEVNCSFPLFMNGISKELEMKKVYH 

YGDYVTLKCEDGYTLEGSPWSQCQADDRWD 

PPLAKCTSRTHDALIVGTLSGTIFFILLIIFLSWI 

ILKHRKGNNAHENPKEVAIHLHSQGGSSVHP

RTLQTNEENSRVLP 


803 


2153 


A 


6574 


2 


3233 


HGRSARLAAVPAEAMPGPRRPAGSRLRLLLL 

LLLPPLLLLLRG\SHAGNLTVAWLPLANTSY 

PWSWAVRVGPAVELALAQVKARPDLLPGWT 

VRTVLGSSENALGVCSDTAAPLAAVDLKWE 

HNPAVFLGPGCVYAAAPVGRFTAHWRVPLL 

TAGAPALGFGVKDEYALTTRAGPSYAKLGDF 

VAALHRRLGWERQALMLYAYRPGDEEHCFF 

LVEGLFMRVRDRLNITVDHLEFAEDDLSHYT 

RLLRTMPRKGRVIYICSSPDAFRTLMLLALEA

GLCGEDYVFFHLDIFGQSLQGGQGPAPRRPW 

ERGDGQDVSARQAFQAAK1ITYKDPDNPEYL 

EFLKQLKHLAYEQFNFTMEDGLVNTIPASFH 

DGLLLYIQAVTETLAHGGTVTDGENITQRMW 
NRSFQGVTGYLKIDSSGDRETDFSLWDMDPE 
NGAFRVVLNYNGTSQELVAVSGRKLNWPLG

YPPPDIPKCGFDNEDPACNQDHLSTLEVLALV 

GSLSLLGILIVSFFIYRKMQLEKELASELWRVR 

WEDVEPSSLERHLRSAGSRLTLSGRGSNYGSL 

LTTEGQFQVFAKTAYYKGNLVAVfCRVNRKR 

1ELTRKVLFELKHMRDVQNEHLTRFVGACTD 

PPNICILTEYCPRGSLQDTLENESITLDWMFRY 

SLTNDIVKGMLFLHNGAICSHGNLKSSNCW 

DGRFVLKITDYGLESFRDLDPEQGHTVYAKK 

LWTAPELLRMASPPVRGSQAGDVYSFGIILQE 

IALRSGVFHVEGLDLSPKEUERVTRGEQPPFR 

PSLALQSHLEELGLLMQRCWAEDPQERPPFQ 

QIRLTLRKTORENSSN1LDNLLSRMEQYANNL 

EELVEERTQAYLEEKRKAEALLYQILPHSVAE 

QLKRGETVQAEAFDSVTIYFSDIVGFTALSAE 

STPMQWTLLNDLYTCFDAVIDNFDVYKVET 

IGDAYMWSGLPVRNGRLHACEVARMALAL 

LDAVRSFRIRHRPQEQLRLRIGIHTGPVCAGV 

VGLKMPRYCLFGDTVNTASRMESNGEALUCI 

HLSSVETKAVL\EEFGGFELELRGDVEMKGKG 

KVRTYWLLGERGSSTRG 


804 


2154 


A 


6585 


2 


3837 


DAPGRPPVRLPTMELEDGWYQEEPGGSGAV 

MSERVSGLAGSIYREFERLIVRYDEEWKELIP 

LWAVLENLDSVFAQDQEHQVELELLRDDNE 

QLITQYEREKALRKHAEEKFIEFEDSQEQEKK 

DLQTRVESLESQTRQLELKAKNYADQISILEE 

REAELKKEYNALHQRHTEMIHNYMEHLERT 

KLHQLSGSDQLESTAHSRJRKERPISLGIFPLP 

AGDGLLTPDAQKGGETPGSEQWKFQELSQPR 

SHTSLKDELSDVSQGGSKATTPASTANSDVA 

TIPTDTPLKEENEGFVKVTDAPNKSEISKH1EV 

QVAQETRNVSTGSAENEEKSEVQAIIESTPEL 

DMDKDLSGYKGSSTPTKGIENKAFDRNTESL

FEELSSAGSGLIGDVDEGADLLGMGREVENLI 

LENTQLLETKNALNIVKNDLIAKVDELTCEK 

DVLQGELEAVKQAKLKXEEKNRELEEELRKA 

RAEAEDARQKAKDDDDSDIPTAQRKRFTRVE 

MARVLMERNQYKERLMELQEAVRWTEMIR 

ASRENPAMQEKKRSSIWQFFSRLFSSSSNTTK 

KPEPPVNLKYNAPTSHVTPSVKKRSSTLSQLP 

GDKSKAFDFLSEETEASLASRREQKREQYRQ 

VKAHVQKEDGRVQAFGWSLPQKYKQVTNG 

QGENKMCNLPVPVYLRPLDEKDTSMKLWCA 

VGWLSGGKTRDGGSVVGASVFYKDVAGLD 

TEGSKQRSASQSSLDKLDQELKEQQKELKNQ 

EELSSLVWICTSTHSATKVLIIDAVQPGNILDS

FTVCNSHVLCIASVPGARETDYPAGEDLSESG 

QVDKASLCGSMTSNSSAETDSLLGGITWGC 

SAEGVTGAATSPSTNGASPVMDKPPEMEAEN 

SEVDENVPTAEE\ATEATEGNAGSAEDTV\DIS 

QTGVYTEHVFTDPLGWQIPEDLSPVYQSSND 

SDAYKDQISVLPNEQDLVREEAQKMSSLLPT 

MWLGAQNGCLYVHSSVAQWRKCLHSIKLKD 

SILSIVHVKGIVLVALADGTLAIFHRGVDGQW 

DLSNYHLLDLGRPIIIISIRCMTVVHDKVWCG 

YRNK1YVVQPKAMKIEKSFDAHPRKESQVRQ 

LAWVGDGVWVSIRLDSTLRLYHAHTYQHLQ 

DVDIEPYVSKMLGTGKLGFSFVR1TALMVSC 

NRLWVGTGNGVIISIPLTETVILHQGRLLGLR 
ANKTSGVPGNRPGSVIRVYGDENSDKVTPGT 
FIPYCSMAHAQLCFHGHRDAVKFFVAVPGQV 
ISPQSSSSGTDLTGDKGRGHLHRSLVVRRP 


805 


2155 


A 


6605 


469 


2602 


FGRLLWGTAFKSWKMKAPIPHLILLYATFTQ 

SLKVVTKRGSADGCTDWS1DIKKYQVLVGEP 

VRIKCALFYGYIRTNYSLAQSAGLSLMWYKS 

SGPGDFEEPIAFDGSRMSKEEDSIWFRPTLLQ 

DSGLYACVIRNSTYCMKVSISLTVGENDTGL 

CYNSKMKYFEKAELSKSKEISCRDIEDFLLPT 

REPEILWYKECRTKTWRPSIVFKRDTLLIREV 

REDDIGNYTCELKYGGFWRRTTELTVTAPL 

TDKPPKLLYPMESKLTIQETQLGDSANLTCRA 

FFGYSGDVSPLIYWMKGEKFIEDLDENRVWE 

SDIUCILKEHLGEQEVSISLIVDSVEEGDLGNYS 

CYVENGNGRRHASVLLHKRELMYTVELAGG 

LGAILLLLVCLVTIYKCYKIEIMLFYRNHFGA 

EELDGDNKDYDAYLSYTKVDPDQWNQETGE 

EERFALEILPDMLEKHYGYKLFIPDRDLIPTGT 

YIEDVARCVDQSKRLIIVMTPNYVVRRGWSIF 

ELETRLRNMLVTGEIKVILIECSELRGIMNYQE 

VEALKHTIKLLTVIKWHGPKCNKLNSKFWKR 

LQYEMPFKRIEP1THEQALDVSEQGPFGELQT 

VSAISMAAATSTALATAHPDLRSTFHNTYHS 

QMRQKHYYRSYEYDVPPTGTLPLTSIGNQHT 

YCNIPMTLINGQRPQTKSSREQNPDEAHTNSA 

ILPLLPRETSISSVIW 


806 


2156 


A 


6614 


3 


1584 


NSARGGVGVRGARAMATVQEKAAALNLSAL 

HSPAHRPPGFSVAQKPFGATYVWSSIINTLQT 

QVEVKKRRHRLKRHNDCFVGSEAVDVIFSHL 

IQNKYFGDVDIPRAKVVRVCQALMDYKVFE 

A VPTK VFGKDKKPTFEDS SCSLYRK n 1PNQD 

SQLGKENKLYSPARYADALFKSSDIRSASLED

LWENLSLKPANSPHVN1STTLSPQVINEVWQE 

ETIGRJLLQLVDLPLLDSLLKQQEAVPKIPQPK 

RQSTMVNSSNYLDRGILKAYSDSQEDEWLSA 

AIDCLEYLPDQMWEISRSFPEQPDRTDLVKE 

LLFDAIGRYYSSREPLLNHLSDVIINGIAELLV 

NGKTEIALEATQLLLKLLDFQNREEFRRLLYF 

MAVAANPSEFKLQKESDNRMVVKRIFSKAIV 

DNKNLSKGKTDLLVLFL\MDHQKDVFKIPGT 

L\HKIVS\VKVLMAIQNGRDPNRDAGYIYCQRI 

DQRDYSNITEKTTIDELLYLLKTLDEDSKLSA 

KEKKK\LLGQFYKCHPDIFIEHFGD 


807 


2157 


A 


6615 


4198 


2094 


FGIVGTFALETDELDSDRDPAIFSLCDFGAMR

PQILLLLALLTLGLAAQHQDKVPCKM/VKML 

CPDRVDKKVSCQVLGLLQVPSVLPPDTETLD 

LSGNQLRSILASPLGFYTA1RHLDLSTNEISFL 

QPGAFQALTHLEHLSLAHNRLAMATALSAG 

GLGPLPRVTSLDLSGNSLYSGLLERLLGEAPS 

LHTLSLAENSLTRLTRH1FRDMPALEQLDLHS 

NVLMDIEDGAFEGLPRLTHLNLSRNSLTCISD 

FSLQQLRVLDLSCNSIEAFQTAS\QPQAEFQLT 

WLDLRENKLLHFPDLAALPRLIYLNLSNNLIR 

LPTGPPQDSKGIHAPSEGWSALPLSXAPSGNAS 

GRPLSQLLNLDLSYNEIELIPDSFLEHLTSLCFL 

NLSRNCLRTFEARRLGSLPCLMLLDLSHNALE 

TLELGARALG\SLRTLLLQGNALRDLPPYTFA 

NLASLQRLNLQGNRVSPCGGPDEPGP\SGCV\ 

AFSGITSLRSLSLVDNEIELLRAGAFLHTPLTE 
SEQ ID NO: of nucleotide sequence


SEQ ID NO: of peptide sequence


Method


SEQ ID NO: in USSN 09/496,914


Predicted beginning nucleotide location corresponding to first amino acid residue of peptide sequence


Predicted end nucleotide location corresponding to last amino acid residue of peptide sequence


Amino acid sequence (A=Alanine, C=Cysteine,
D=Aspartic Acid, E=Glutamic Acid,
F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine,
M=Methionine, N=Asparagine, P=Proline,
Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan,
Y=Tyrosine, X=Unknown, *=Stop codon,
/=possible nucleotide deletion, \=possible
nucleotide insertion)

LDLSSNPGLEVATGALGGLEASLEVLALQGN

GLMVLQVDLPCFICLKRLNLAENRLSHLPAW 

TQAVSLEVLDLRNNSFSLLPGSAMGGLETSLR 

RLYLQGNPLSCCGNGWLAAQLHQGRVDVDA 

TQDUCRFSSQEEVSLSHVRPEDCEKGGLKNI 

NLIIILTFILVSAILLTTLAACCCVRRQKFNQQ 

YKA 


808 


2158 


A 


6619 


153 


1852 


FKALSQYIYTNTHLEREAAFEVAILLRRMEEG 

ARHRNNTEKKHPGGGESDASPEAGSGGGGV 

ALKKEIGLVSACGIIVGNIIGSGIFVSPKGVLEN 

AGSVGLALIVWIVTGFITWGALCYAELGVNI 

PKSGGDYFYVKDIFGGLAGFLRLW1AVLVIYP 

TNQAVIALTFSNYVLQPLFPTCFPPESGLRLLA 

AICLLLLTWVNCSSVRWATRVQDIFTAGKLL 

ALALIIIMGIVQICKGEYFWLEPKNAFENFQEP 

DIGLVALAFLQGSFAYGGWNFLNYWTEELV 

DP\YKNL\PRAIFISIP\LVTFVYVFANV/ALYVT 

AMSPQEL\LAS\NAVAVTFGEKLLGVMAWIM 

PISVALSTFGGVNGSLFTSSRLFFAGAREGHLP 

SVLAMIHVKRCTPIPALLFTCISTLLMLVTSD 

MYTLINYVGFINYLFYGVTVAGQIVLRWKKP 

DIPRPIKINLLFPIIYLLFWAFLLVFSLWSEPVV 

CGIGLAIMLTGVPVYFLGVYWQHKPKCFSDFI 

ELLTLVSQKMCVWYPEVERGSGTEEANED 

MEEQQQPMYQPTPTKDKDVAGQPQP 




809


2159


A


6621


1041 


223 


QDSRKMLPSTSVNSLVQGNGVLNSRDAARH 
TAGAKRYKYLRRLFRTRQMDFEFAAWQMLY 
LFTSPQRVYRNFHYRKQTKDQWARDDPAFL
VLLSIWLCVSTIGFGFVLDMGFFETIKLLLWV 
VLIDCVGVGLLIATLMWFISNKYLVKRQSRD
YDVEWGYAFDVHLNAFYPLLVILHFIQLFFIN 
HVILTDTFIGYLVGNTLWLVAVGYYIYVTFL 
GYSVGLLFFSWLPFLKNTVILLYPFAPLILLYG 
LSLALGWNFTHTLCSFYKYRVK


810 


2160 


A 


6623 


160 


822 


SPASGHCRLNGAAVAMFGCLVAGRLVQTAA 

QQVAEDKTVFDLPDYESINHVVVFMLGTIPFP 

EGMGGSVYFSYPDSNGMPVWQLLGFVTNGK 

PSAIFKISGLKSGEGSQHPFGAMNIVR'rPSVAQ 

IGISVELLDSMAQQTPVGNAAVSSVDSFTQFT

QKMLDNFYNFASSFAVSQ/VPDDTQ/RPSEMF 

IPANVVLKWYENFQRRTSTCPSLLENIIWIKIN 

F 


811 


2161 


A 


6627 


18 


3367 


LEGSLNTERAKYYLTITMPHFTVTKVEDPEEG 
AAASISQEPSLADIKARIQDSDEPDLSQNSITG 
EHSQLLDDGHKKARNAYLNNSNYEEGDEYF
DKNLALFEEEMDTRPKVSSLLNRMANYTNLT 
QGAKEHEEAENITEGKKKPTKTPQMGTFMG 
VYLPCLQNIFGVILFLRLTWVVGTAGVLQAF
AIVLICCCCTMLTAISMSAIATNGVVPAGGSY
FMISRALGPEFGGAVGLCFYLGTTFAAAMYIL 
GAIEIFLVY1VPRAAIFHSDDALKESAAMLNN 

V/fPWriTAT7I VTf NyfVT \/\/T7Tn\rD"V\/XTVC A CT TTT 

iviKv * \j i/WLi v L,rvi vij v vrioviv i vjVlJvrAoLFrL 

ACVIVSILAIYAGAIKSSFAPPHFPVCMLGNRT 

LSSRHroVCSKTKEINNMTVPSKLWGFFCNSS 

QFFNATCDEYFVHNNVTSIQGIPGLASGIITEN

LWSNYLPKGEUEKjPSAKSSDVLGSLNHEYVL 

VDITTSFTLLVGIFFPSVTGIMAGSNRSGDLKD 

AQKSIPIGTILAILTTSFVYLSNVVLFGACIEGV 

VLRDKFGDAVKGNLVVGTLSWPSPWVIVIGS 

FFSTCGAGLQSLTGAPRLLQAiAKDNIIPFLRV 

FGHSKANGEPTWALLLTAAIAELGILIASLDL 

VAPILSMFFLMCYLFVNLACALQTLLRTPNW 

RPRFRYYHWALSFMGMSICLALMFISSWYYA 

IVAMVIAGMIYKYIEYQGAEKEWGDGIRGLS 

LSAARFALLRLEEGPPHTKNWRPQLLVLLKL 

DEDLHVKHPRLLTFASQLKAGKGLTIVGSVIV 

GNFLENYGEALAAEQTDCHLMEAEKVKGFCQ 

LVVAAKLREGISHLIQSCGLGGMKHNTWM 

GWPNGWRQSEDARAWKTFIGTVRVTTAAHL 

ALLVAKNISFFPSNVEOFSEGNIDVWWIVHDG 

GMLMLLPFLLK\QHKVWRKCSIRFFNTVAQLE

DKSIQMKKDLATFLYHLRIEAEVEVVEMHDS 

DISAYTYERTLMMEQRSQMLRHMRLSKTER 

DREAQLVKDRNSMLRLTSIGSDEDEETETYQ 

EKVHMTWTKDKYMASRGQKAKSMEGFQDL 

LNMRPDQSNVRRMHTAVKLNEV1VNKSHEA 

KLVLLNMPGPPRNPEGDENYMEFLEVLTEGL 

ERVLLVRGGGSEVIT1YS 


812 


2162 


A 


6628 


66 


640 


AVCTMSEMAELSELYEESSDLQMDVMPGEG 

DLPQMEVGSGSRELSLRPSRSGAQQLEEEGP 

MEEEEAQPMAAPEGKRSLANGFNAGEQPGQ 

VAGADFESEDEGEEFDDWEDDYDYPEEEQLS 

GAGYRVSAALEEADKMFLRTREPALDGGFQ 

MHYEKTPFDQLAFIEELrASLMVVNRLTEELG 

rDFIIDTCF 


813 


2163 


A 


6630 


708 


1355 


AKMGAYKYIQELWRKKQSDVMRFLLRVRC 

WQYRQLSALHRAPRPTRPDKARRLGYKAKQ 

GY/VYIYIGFVFAVlYRiRVRRGGRKRPVPKG 

ATYGKPVHHGVNQLKFARSLQSVAEERAGR 

HCGALRVLNSYWVGEDSTYKFFEVILIDPFHK 

AIRRNPDTQWITKPVHKHREMRGLTSAGRKS 

RGLGKGHKFHHT1GGSRRAAWRRRNTLQLH 

RYR 


814 


2164 


A 


6635 


201 


1705 


KGTEMNKSRWQSRRRHGRRSHQQNPWFRLR 

DSEDRSDSRAAQPAHDSGHGDDESPSTSSGT 

AGTSS VPELPGF YFDPEKKRYFRL1 .PGHNNCN 

PLTKESIRQKEMESKRLRLLQEEDRRKKIARM 

GFNASSMLRKSQLGFLNVTNYCHLAHELRLS 

CMERKKVQIRSMDPSALASDRFNLILADTNS 

DRLFTVNDVTVGGSKYGnNLQSLKTPTLKVF 

MHENLYFTNRKVANSVCWASLNHLDSHILLC 

LMGLAETPGCATLLPASLFVNSHPAGIDRPG\

MLCSFRIPGAWSCAWSLNIQANNCFSTGLSR 

RVLLTOWTGHRQSFGTNSDVLAQQFALMA 

PLLFNGCRSGEIFAIDLRCGNQGKGWKATRLF 

HDSAVTSVRILQDEQYLMASDMAGKIKLWD 

LRTTKCVRQYEGHVNEYAYLPLHVHEEEGIL 

VAVGQDCYTRIWSLHDARLLRTIPSPYPASKA 

DIPSVAFSSRLGGSRGAPGLLMAVGQDLYCY

SYS 


815 


2165 


A 


6643 


659 


3282 


NKNILEVPSARTTRIMGDHLDLLLGVVLMAG 

PVFGIPSCSFDGRIAFYRFCNLTQVPQVLNTTE 

RLLLSFNYIRTVTASSFPFLEQLQLLELGSQYT 

PLTIDKEAFRNLPNLRILDLGSSKIYFLHPDAF 

QGLFHLFELRLYFCGLSDAVLKDGYFRNLKA 

LTRLDLSKNQIRSLYLHPSFGKLNSLKSIDFSS 

NQ1FLVCEHELEPLQGKTLSFFSLAANSLYSR 

VSVDWGKCMNPFRNMVLEILDVSGNGWTV 

DITGNFSNAISKSQAFSLILAHHIMGAGFGFHN 

IKDPDQNTFAGLARSSVRHLDLSHGFVFSLNS 

RVFETLKDLKVLNLAYNK1NKIADEAFYGLD 

NLQVLNLSYNLLGELYSSNFYGLPKVAYIDL 

QKNHI A1IQDQTFKFLEKLQTLDLRDN ALTT1 H 

FIPSBPDIFLSGNKLVTLPKINLTANLIHLSENR 

LENLDILYFLLRVPHLQILILNQNRFSSCSGDQ 

TPSENPSLEQLFLGENMLQLAWETELCWDVF 

EGLSHLQVLYLNHNYLNSLPPGVFSHLTALR 

GLSLNSNRLTVLSHNDLPANLE1LDISRNQLL 

APNPDVFVSLSVLDITHNKFICECELSTFINWL

NH J iWTIAGPPADIYCVYPDSLSG VSLFSLSTE 

GCDEEEVLKSLKFSLFIVCTVTLTLFLMTILTV 

TKFRGFCFICYKTAQRLVFKDHPQGTEPDMY 

KYDAYLCFSSKDFTWVQNALLKHLDTQYSD 

QNRFNLCFEERDFVPGENRPVANIQDAIWNSR 

KIVCLVSRHFLRDGWCLEAFSYAQGRCLSDL 

NSALIMVWGSLSQYQLMKHQS1RGFVQKQQ 

YLRWPEDLQDVGWFLHKLSQQILKKEKEKK 

KDNN1PLQTVATIS 


816 


2166 


A 


6646 


1 


3811 


RDRAGVRPAGKQHAAAAFYDVGGDRPWDS 

GNTQLPPRNPVKANAMFGAGDEDDTDFLSPS 

GGARLASLFGLDQAAAGHGNEFFQYTAPKQP 

KKGQGTAATGNQATPKTAPATMSTPTILVAT 

AVHAYRYTNGQYVKQGKFGAAVLGNHTTR 

EYRILLYISQQQPVTVARIHVNFELMVRPNNY 

STFYDDQRQNWSIMFESEKAAVEFNKQVC1A 

KCNSTSSLDAVLSQDL1VADGPAVEVGDSLE 

VAYTGWLFQNHVLGQVFDSTANKDKLLRLK 

LGSGKV1KGWEDGMLGMKKGGKRLLIVPPA 

CAVGSEGVIGWTQATDSILVFEVEVRRVK1A 

KDSGSDGHSVSSRDSAAPSP1PGADNLSADPV 

VSPPTSIPFKSGEPALRTKSNSLSEQLAINTSPD 

AVKAKLISRMAKMGQPMLPILPPQLDSNDSE1

EDVNTLQGGGQPVVTPSVQPSLQPAHPALPQ 

MTSQAPQPSVTGLQAPSAALMQVSSLDSHSA 

VSGNAQSFQPYAGMQAYAYPQASAVTSQLQ 

PVRPLYPAPLSQPPHFQGSGDMASFLMTEAR 

QHNTEIRMAVSKVADKMDHLMTKVEELQKH 

SAGNSMLIPSMSVTMETSMIMSNIQRIIQENER 

LKQEILEKSNRIEEQNDKISELIERNQRYVEQS 

NLMMEKRNNSLQTATENTQARVLHAEQEKA 

KVTEELAAATAQVSHLQLKMTAHQKKETEL

QMQLTESLKETDLLRGQLTKVQAKLSELQET

SEQAQSKFKSEKQNRKQLELKVTSLEEELTDL 

RVEKESLEKNLSERKKKSAQERSQAEEE1DEI 

RKSYQEELDKLRQLLKKTRVSTDQAAAEQLS 

LVQAELQTQWEAKCEHLLASAKDEHLQQYQ 

EVCAQRDAYQQKLVQLQEKSVCFAVCLALQA 

QITALTKQNEQHIKELEKNKSQMSGVEAAAS

DPSEKVKKIMNQVFQSLRREFELEESYNGRT1 

LGTIMNTIKMVTLQLLNQQEQEKEESSSEEEE

EKAEERPRRPSQEQSASASSGQPQAPLNRERP 
FSPMVP^FO VVFF A VP7 PPT) A T TTQrkrvr;uT3 u 

KGDSEAEALSEIKDGSLPPELSCIPSHRVLGPP 

TSIPPEPLGPVSMDSECEESLAASPMAAK\PDN 

PSGK\VCVREVAPDGPLQESSTRLSLTS\DPEE 

GDPLALGPESPGEPQPPQLKKDDVTSSTGPHK 

ELSSTEAGSTVAGAALRPSHHSQRSSLSGDEE

DELFKGATLKALRPKAQPEEEDEDEVSMKGR 

PPPTPLFGDDDDDDDIDWLG 


817 


2167 


A 


6649 


63 


1073 


FFRSSSDNGSPIRQYE/HSTPAHQGPVMGLEG 

KS/ARNSQLRIVLVGKTGAGKSATGNSILGRK
VFH^GTvi AK ^ITK^PFKR^^WXTFTFT WVF> 

Vi nOU InAivul iiuVV^^AJJO rr ivL V V v L/ 

TPGIFDTEVPNAETSKEIIRCILLTSPGPHALLL 

VWLGRYTEEEHKATEKILKMFGERARSFMIL 

IFTRKDDLGDTNLHDYLREAPEDIQDLMDIFG 

DRYCALNNKATGAEQEAQRAQLLGL1QRW 

RENKEGCYTNRMYQRAEEEIQKQTQAMQEL 

HRVELEREKARIREEYEEKIRKLEDKVEQEKR 

KKQMEKKLAEQEAHYAVRQQRARTEVESKD 

GILELIMTALQIASFILLRLFAED 


818 


2168 


A 


6660 


357 


1890 


APSGSWTRVVLTLDPCSLRSRSPRSLLDPGMP 

GISARGLSHEGRKQLAVNLTRVLALYRSILDA 

YIIEFF\TDNLWDTLPCSWQEALDGLKPPQLA 

TMLLGMPGEGEVVRYRSVWPLTLLALKSTA 

CALAFTRMPGFQ1TSEFLENPSQSSRLTAPFR 

KHVRPKKQHEIRRLGELVKKLSDFT/GLHPGC 

RRGLRPG\HLSRFMALGLGLMVKSIEGDQRL 

VERAQRLDQELLQALEKEEKRNPQVVQTSPR 

RLLLTGLHACG\DLSVALLRHFSCCPEWALA 

SVGCCYMKLSDPGGYPLSQWVAGLPGYELP 

YRLREGACHALEEYAERLQKAGPGLRTHCY 

RAALETVIRRARPELRRPGVQGIPRVHELKIEE 

YVQRGLQRVGLDPQLPLNLAALQAHLAQEN 

RWAFFSLALLLAPLVETLDLLDRLLYLQEQA 

LSPVGFHAELLPIFSPELSPRNLVLVATKMPLG 

QALSVLETEDS 


819 


2169 


A 


6661 


65 


2686 


SGSGHCLAEAASMGPWGWKLRWTVALLLA 

AAGTAVGDRCERNEFQCQDGKCISYKWVCD 

GSAECQDGSDESQETCLSVTCICSGDFSCGGR 

VNRCIPQFWRCDGQVDCDNGSDEQGCPPKTC 

SQDEFRCHDGKCISRQFVCDSDRDCLDGSDE 

ASCPVLTCGPASFQCNSSTCIPQLWACDNDPD 

CEDGSDEWPQRCRGLYVFQGDSSPCSAFEFH 

CLSGECIHSSWRCDGGPDCKDKSDEENCAVA 

TCRPDEFQCSDGNC1HGSRQCDREYDCKDMS 

DEVGCVNVTLCEGPNKFKCHSGECITLDKVC 

NMARDCRDWSDEPIKECGTNECLDNNGGCS 

HVCNDLKIGYECLCPDGFQLVAQRRCEDIDE 

CQDPDTCSQLCVNLEGGYKCQCEEGFQLDPH 

TKACKAVGSIAYLFFTNRHEVRKMTLDRSEY 

TSLIPNLRNVVALDTEVASNRIYWSDLSQRMI

CSTQLDRAHGVSSYDTVISRD1QAPDGLAVD 

WIHSNIYWTDSVLGTVSVADTKGVKRKTLFR 

ENGSKPRAIWDPVHGFMYWTDWGTPAK1K 

KGGLNGVDIYSLVTENIQWPNGITLDLLSGRL 

YWVDSKLHSISS1DVNGGNRKTILEDEKRLAH 

rr&L,J\v r&LJlf^vr W 1 IJllJNilAIr oAJNtvLt J uouv 

NLLAENLLSPEDMVLFHNLTQPRGVNWCERT 
TLSNGGCQYLCLPAPQINPHSPKFTCACPDGM 
LLAR\DMRSCLTEG\EAAVATQETSTVRLKVS 
^TAVRTOHTTTT^PVPnT^RI PGATPG1 TTVPT 

VTMSHQALGDVAG\RGN\EKKPSSVRALSIVL 
PIVXLLVFLCLGVFLLWKNWRLKNINSINFDNP
VYQICTTEDEVHICHNQDGYSYPSRQMVSLED 
DVA 


820 


2170 


A 


6666 


17 


4146 


ERGISSQIKGMKSGSGGGSPTSLWGLLFLSAA 
LSLWPTSGEICGPGIDIRNDYQQLKRLENCTV1 
EGYLHILLISKAEDYRSYRFPKLTVITEYLLLF 
RVAGLESLGDLFPNLTVIRGWKLFYNYALVIF 

EMTNLKDIGLYNLRNITRGVAIRJEKNADLCYL 

STVDWSLILDAVSNNYIVGNKPPKECGDLCP 

GTMEEKPMCEKTTINNEYNYRCWTTNRCQK 

MCPSTCGKRACTENNECCHPECLGSCSAPDN 

DTACVACRHYYYAGVCVPACPPNTYRFEGW 

RCVDRDFCANILSAESSDSEGFVIHDGECMQE 

CPSGFIRNGSQSMYCIPCEGPCPKVCEEEKKT 

KTIDSVTSAQMLQGCTIFKGNLLINIRRGNNIA

SELENFMGLIEVVTGYVKIRHSHALVSLSFLK 

NLRLILGEEQLEGNYSFYVLDNQNLQQLWD 

WDHRNLTIKAGKMYFAFNPKLCVSEIYRMEE 

VTGTKGRQSKGDINTRNNGERASCESDVLHF 

TSTTTSKNRIIITWHRYRPPDYRDLISFTVYYK 

EAPFKNVTEYDGQDACG5NSWNMVDVDLPP 

NKDVEPGILLHGLKPWTQYAVYVKAVTLTM 

VENDHIRGAKSEILYIRTNASVPSIPLDVLSAS 

NSSSQLIVKWNPPSLPNGNLSYYIVRWQRQP 

QDGYLYRHNYCSKDKIPIRKYADGTIDIEEVT 

ENPKTEVCGGEKGPCCACPKTEAEKQAEKEE 

AEYRKVFENFLHNSIFVPRPERKRRDVMQVA

NTTMSSRSRNTTAADTYNITDPEELETEYPFF 

ESRVDNKERTVISNLRPFTLYRIDIHSCNHEAE 

KLGCSASNFVFARTMPAEGADDIPGPVTWEP 

RPENSIFLKWPEPENPNGLILMYEIKYGSQVE 

DQRECVSRQEYRKYGGAKLNRLNPGNYTARI

QATSLSGNGSWTDPVFFYVQAKRYENFIHLII

ALPVAVLLIVGGLVIMLYVFHRKRNNSRLGN 

GVLYASVNPEYFSAADVYVPDEWEVAREKIT 

MSRELGQGSFGMVYEGVAKGVVKDEPETRV 

AIKTVNEAASMRERIEFLNEASVMKEFNCHH 

WRLLGWSQGQPTLVIMELMTRGDLKSYLR 

SLRPEmENNPVLAPPSLSKMIQMAGEIADGM 

AYLNANKFVHRDLAARNCMVAEDFTVKIGD 

FGMTRDIYETDYYRKGGKGLLPVRWMSPESL 

KDGVFTTYSDVWSFGWLWEIATLAEQPYQ 

GLSNEQVLRFV\MEGGLLDKJPDNCPDMLFEL 

MRMCWQYNPKMRPSFLEIISSIKEEMEPGFRE 

VSFYYSEENKLPEPEELDLEPENMESVPLDPS 

ASSSSLPLPDRHSGHKAENGPGPGVLVLRASF 

HPDriPV A U\jTK.m/^D fc'-MITD AT DI DAOCTP 

Un-Kl^r I Ai-IMIN uuKAlNbKALrLr^bb 1 L- 


821 


2171 


A 


6691 


106 


825 


GRVLFRGCGVGHKGQVLMGTFILAQDWLSE 

SNHVFCVSSMLRLQKRLASSVLRCGKKKVW 

LDPNETNEIANANSRQQIR1CLIKDGLIIRKPVT

VHSRARCRiCNTLARRKGRHMGIGKRKGTAN 

ARMPEKVTWMRRMRILRRLLRRYRES/KRYR 

ESKKIDRHMYFISLYLKVKGNVFKNKRILMEH 

IHKLKADKARKKLLADQAEARRSKTKEARK 

KKUbKLl^ AJVA.bbllK I L,olvbbb 1 ivix 


822 


2172 


A 


6715 


772 


21 


DFRPGLLLPRKKKMFGFHKPKMYRSIEGC\CI 

SGAKSSSS\RFTDSKRYEK\DFQ\SCFGLHETR\ 

SGDI\CNA\CVLL\LKRWKKLPAGSKK\NWNH 
WDARAGPSVT KTTT KPKKVKTT 1KAQT 

QISKLQKEFKR\HNSDAHSTTS\SASP\AQSPLF 
TVNQFRWTGSDTGVGFPGSNRNHPVFSFLDU 
TYWKRQKICCGJMYKGRFGEVLIDTHLFKPCC 
SNKKA\AAEKPEEQGPEPLPISTQEWVTEVFM 


823 


2173 


A 


6727 


3 


4063 


PYLATLQLDSSLLIPPKYQTPPAAAQGQATPG 
NAGPLAPNGSAAPPAGSAFNPTSNSSSTNPAA 
SSSASGSSVPPVSSSASAPGISQISTTSSSGFSGS 
VGGQNPSTGGISADRTQGNIGCGGDTDPGQS 

SSQPSQDGQESNVPSVGSLADPDYLNTPQMN 

TPVTLNSAAPASNSGAGVLPSPATPRFSVPTP

RTPRTPRTPRGGGTASGQGSVKYDSTDQGSP 

ASTPSTTRPLNSVEPATMQPIPEAHSLYVTLIL 

SDSVMNIFKDRNFDSCCICACNMNIKGADVG 

LYIPDSSNEDQYRCTCGFSA1MNRKLGYNSGL 

FLEDELDIFGKNSDIGQAAERRLM\MCQSTFL 

PQVEGTKKPQEPPISLLLLLQNQHTQPFASLN

FLDYISSNNRQTLPCVSWSYDRVQADNNDY 

WTECFNALEQGRQYVDNPTGGKVDEALVRS 

ATVHSWPHSNVLDISMLSSQDWRMLLSLQP 

FLQDAIQKKRTGRTWENIQHVQGPLTWQQFH 

KMAGRGTYGSEESPEPLPIPTLLVGYDKDFLT 

ISPFSLPFWERLLLDPYGGHRDVAYIVVCPEN 

EALLEGAKTFFRDLSAVYEMCRLGQHKPICK 

VLRDGIMRVGKTVAQKLTDELVSEWFNQPW 

SGEENDNHSRLKLYAQVCRHHLAPYLATLQL 

DSSLLIPPKYQTPPAAAQGQATPGNAGPLAPN 

GSAAPPAGSAFNPTSNSSSTNPAASSSASGSSV 

PPVSSSASAPGISQISTTSSSGFSGSVGGQNPST 

GGISADRTQGNIGCGGDTDPGQSSSQPSQDG 

QESVTERERIGIPTEPDSADSHAHPPAVVIYM 

VDPFTYAAEEDSTSGNFWLLSLMRCYTEMLD 

NLPEHMRNSF1LQIVPCQYMLQTMKDEQVFY 

IQYLKSMAFSVYCQCRRPLPTQIHDCSLTGFGP 

AASIEMTLKNPERPSPIQLYSPPFILAPIKDKQT 

ELGETFGEASQKYNVLFVGYCLSHDQRWLL 

ASCTDLHGELLETCVVNIALPNRSRRSKVSAR 

KIGLQKLWEWCIGIVQMTSLPWRVVIGREGR 

LGHGELKDWSILLGECSLQTISKKLKDVCRM 

CGISAADSPSILSACLVAMEPQGSFVVMPDAV 

TMGSVFGRSTALNMQSSQLNTPQDASCTHIL 

VFPTSSTIQVAPANYPNEDGFSPNNDDMFVDL 

PFPDDMDNDIGILMTGNLHSSPNSSPVPSPGSP 

SGIGVGSHFQHSRSQGERLLSREAPEELKQQP 

LALGYFVSTAKAENLPQWFWSSCPQAQNVQC 

PLFLKASLHHHISVAQTDELLPARNSQRVPHP 

LDSKTTSDVLRFVLEQYNALSWLTCNPATQD 

RTSCLPVHFVVLTQLYNAIMNIL 


824 


2174 


A 


6732 


2440 


365 


VEEGLGRRRTPPGGRRGPVTPARPGPDSVRR 

RLLPPSSAAAFSSHRHNLLCSRRRGGGGGGG 

GGGGGTIKRPGITGPTAATSPSGEPGNAASAP 

LSLLSPFPGQTTYQHPGVAEPSAYGGRDVAC 

ASLVFGRLQHRGGDRKRGLLGRSSGDAASD 

QPFRCRSGSTAGRLVKQMDFTEAYADTCSTV 

GLAAREGNVKVLRKLLKKGRSVDVADNRG 

WMPIHEAAYHNSVECLQMLINADSSENYIKM 

KTFEGFCALHLAASQGHWKIVQILLEAGADP 

NATTLEETTPLFLAVENGQIDVLRLLLQHGAN 

VNGSHSMCGWNSLHQASFQENAEIIKLLLRK 

GANKECQDDFGITPLFVAAQYG\KLESL\SILIS 

SO VAN VNCQALDKA I r Lr lAAQbGM J KC VbLL 

LSSGADPDLYCNEDSWQLPIHAAAQMGHTKJ 

LDLLIPLTNRACDTGLNKVSPVYSAVFGGME 

DCLEILLRNGYSPDAQACLVFGFSSPVCMAFQ 

KDCEFFGIVNILLKYGAQINELHLAYCLKYEK 

FS1FRYFLRKGCSLGPWNHIYEFVNHAIKAQA 

KYKEWLPHLLVAGFDPLILLCNSWIDSVSIDT 

LIFTLEFTNWKTLAPAVERMLSARASNAWIL 

QQHIATVPSLTHLCRLEIRSSLKSERLRSDSYIS 

QLPLPRSLHNYLLYEDVLRMYEVPELAA1QD 
G 


825 


2175 


A 


6735 


277 


1252 


RIMGLFDRGVQMLLTTVGAFAAFSLMTIAVG 
TDYWLYSRGVCKTKSVSENETSKKNEEVMT 

novjL WK1 CCJLr^OiNrJ^ULL-Kt^llJHrrJbUAljyhf 

ADTAEYFLRAVRASSIFPILSVILLFMGGLCIA 

ASEFYKTRHNIILSAGIFFVSAGLSNIIGIIVYIS 

ANAGDPSKSDSKKNSYSYGWSFYFGALSFIIA 

EMVGVLAVHMFJDRHKQLRATARAMDYLQ 

ASAJTRIPSYRYRYQRRSRSSSRSTEPSHSRDA 

SPVGIKGFNTLPSTEISMYTLSRDPLKAATTPT

ATYNSDRDNSFLQVHNCIQKJSNKJDSLHSNTA 

NRRTTPV 


826 


2176 


A 


6744 


3 


5177 


SDDLRTGLFQDVQDAESLKLPGVYEVLFYNE 

TEDCPGMMLWRYPEPRGLTLVRITPVPFNTT 

EDPDISTADLGDVLQDPCSLEYWDELQKVFV 

AFREFNLSESKVCELQLPDINLVNDQKKLVSS

DLWRIVLNSSQNGADDQSSASESGSQSTCDPL 

VTPTALAACTRVDSCFTPWFVPSLCVSFQFAH 

LEFHLCHHLDQLGTAAPQYLQPFVSDRNMPS 

ELEYMIVSFREPHMYLRQWNNGSVCQEIQFL 

AQADCKLLECRNVTMQSWKPFSIFGQMAVS 

SDWEKLLDCTVIVDSVFVNLGQHWHSLNT 

AIQAWQQNKCPEVEELVFSHFV1CNDTQETL 

RFGQVDTDENILLASLHSHQYSWRSHKSPQL 

LHICIEGWGNWRWSEPFSVDHAGTFIRTIQYR 

GRTASLIIKVQQLNGVQKQ1IICGRQ1ICSYLSQ 

S1ELKVVQHYIGQDGQAVVREHFDCLTAKQK 

LPSYILENNELTELCVKAKGDEDWSRDVCLE 

SKAPEYSIVIQVPSSNSSHYVWCrVLTLEPNS 

QVQQRMTVFSPLFIMRSHLPDPniHLEKRSLGL 

SETQIIPGKGQEKPLQNIEPDLVHHLTFQAREE 

YDPSDCAVPISTSLIKQIATKVHPGGTVNQILD 

EFYGPEKSLQPIWPYNKKDSDRNEQLSQWDS 

PMRVKLSrWKPYVRTLLIELLPWAJLLINESKW 

DLWLFEGEKIVLQVPAGKIIIPPNFQEAFQIGIY 

WANTNTVHKSVAIKLVHNLTSPKWKDGGNG 

EWTLDEEAFVDTEIRLGAFPGHQKLCQFCIS 

SMVQQGIQIIQ1EDKTTIINNTPYQIFYKPQLSV 

CNPHSGKEYFRVPDSATFSICPGGEQPAMKSS 

SLPCWDLMPDISQSVLDASLLQKQIMLGFSPA 

PGADSSQCWSLPAIVRPEFPRQSVAVPLGNFR 

ENGFCTRAJVLTYQEHLGVTYLTLSEDPSPRV 

IIHNRCPVKMLIKENIKDIPKFEVYCKKIPSECS 

IHHELYHQISSYPDCKTKDLLPSLLLRVEPLDE

VTTEWSDAIDINSQGTQWFLTGFGYVYVDV 

VHQCGTVFITVAPEGKAGPILTNTNRAPEKIV 

TF/KMFITQLSLAVFDDLTHHKASAELLRLTL 

DNIFLCVAPGAGPLPGEEPVAALFELYCVEIC 

CGDLQLDNQLYNKSNFHFAVLVCQGEKAEPI 

QCSKMQSLLISNKELEEYKEKCFIKLCITLNEG 

iv oi i^v^L'ii.N jl-jt or i-^j_«rsa /ajvl, i v iz>xJ if v i i Xlvi l^r 

DTYLPNSRLAGHSTHLSGGKQVLPMQVTQH 
ARALVNPVKLRKLVIQPVNLLVSIHASLKLY1 
ASDHTPLSFSVFERGPIFTTARQLVHALAMHY 
AAGALFRAGWWGSLDILGSPASLVRSIGNG 
VADFFRJLPYEGLTRGPGAFVSGVSRGTTSFVK 
HISKGTLTSITNLATSLARNMDRLSLDEEHYN
RQEEWRRQLPESLGEGLRQGLSRLGISLLGAI
AGIVDQPMQNFQKTSEAQASAGHKAKGVISG 

VGKGIMGVFTKP1GGAAELVSQTGYGILHGA 

GLSQLPKQRHQPSDWHADQAPNSHVKYVW 

KMLQSLGRPEVHMALDVVLVRGSGQEHEGC 

LLLTSEVLFVVSVSEDTQQQAFPVTEIDCAQD 

SKQNNLLTVQLKQPRVACDVEVDGVRERLSE 

QQYNRLVDYITKTSCHLAPSCSSMQIPCPWA 

AEPPPSTVKTYHYLVDPHFAQVFLSKFTMVK 

NKALRKGFP 


827 


2177 


A 


6748 


2 


1662 


FVGAPRRGNPFGSPGNPGRHQGPCHRPRGTK 

ASGVSPTLWRPQAAATGLEMPSSGRALLDSP 

LDSGSLTSLDSSVFCSEGEGEPLALGDCFTVN 

VGGSRFVLSQQALSCFPHTRLGKLAVVVASY 

RRPGAJLAAVPSPLELCDDANPVDNEYFFDRS 

SQAFRYVLHYYRTGRLHVMEQLCALSFLQEI 

QYWGIDELSIDSCCRDRYFRRKELSETLDFKK 

DTEDQESQHESEQDFSQGPCPTVRQKLWN1L 

EKPGSSTAARIFGVISIIFVGVSIINMALMSAEL 

CU/T Til HI T FTI FWPT^WFTnFFVT PFI PVRTl 

RCRFLRKVPNnDLLAILPFYITLLVESLSG\SQT 

TQEL\ENVGAHCPGCLRLLRAL\RMLKAWGR 

HSTGLRSLGMTITQCYEEVGLLLLFLSVGISIF 

STVEYFAEQSIPDTTFTSVPCAWWWATTSMT 

TVGYGDIRPDTTTGKIVAFMCILSG1LVLALPI 

AIINDRFSACYFTlJaKEAAVRQREALKKLTK 

NIATDSYISVNLRDVYARSIMEMLRLKGRER 

ASTRSSGGDDFWF 


828 


2178 


A 


6786 


5672 


1360 


GTHPASSGPVPLPPAAVSAATREELGEPVPFV 

TASSGFQSMHSSNPKVRSSPSGNTQSSPKSKQ 

EVMVRPPTVMSPSGNPQLDSKFSNQGKQGGS

ASQSQPSPCDSKSGGHTPKALPGPGGSMGLK 

NGAGNGAKGKGKRERSISADSFDQRDPGTPN 

DDSDIKECNSADHIKSQDSQHTPHSMTPSNAT 

APRSSTPPHGQTTATEPTPAQKTPAKWYVFS 

TEMANKAAEAVLKGQVETIVSFHIQN1SNNK 

TERSTAPLNTQISALRNDPKPLPQQPPAPANQ 

DQNSSQNTRLQPTPPIPAPAPKPAAPPRPLDRE

SPGVENKLIPSVGSPASSTPLPPDGTGPNSTPN 

NRAVTPVSQGSNSSSADPKAPPPPPVSSGEPPT 

LGENPDGLSQEQLEHRERSLQTLRDIQRMLFP 

DEKEFTGAQSGGPQQNPGVLDGPQKKPEGPI 

QAMMAQSQSLGKGPGPRTDVGAPFGPQGHR 

DVPFSPDEMVPPSMNSQSOnGPDHLDHMTP 

EQIAWLKLQQEFYEEKRRKPEQVWQQCSLQ 

DMMVHQHGPRGVVRGPPPPYQMTPSEGWAP 

GGTEPFSDGINMPHSLPPRGMAPHPNMPGSQ 

MRLPGFAGMINSEMEGPNVPNPASRPGLSGV 

SWPDDVPKIPDGRNFPPGQGIFSGPGRGERFP 

NPQGLSEEMFQQQLAEKQLGLPPGMAMEGIR 

PSMEMNRMIPGSQRHMEPGNNPIFPRIPVEGP 

LSPSRGDFPKGIPPQMGPGRELEFGMVPSGM 

KGDVNLNVNMGSNSQMIPQKMREAGAGPEE 

MLKLRPGGSDMLPAOOKMVPLPFGEHPOOF 

YGMGPRPFLPMSQGPGSNSGLRNLREPIGPDQ 

RTNSRLSHMPPLPLNPSSNPTSLNTAPPVQRG 

LGRKPLDISVAGSQVHSPGINPLKSPTN4HQVQ 

SPMLGSPSGNLKSPQTPSQLAGMLAGPAAAA 

SIKSPPVLGSAAASPVHLKSPSLPAPSPGWTSS 

PEPPLQSPGIPPNHKAPLTMASPAMLGNVESG 

GPPPPTASQPASVNIPGXSLPSSTPYTMPPEPTL 

SQNPLSIM\MSR\MSKFAM\PS\SNPGYNHDAJ

KTVASSDDDSPPARSPNLPSMNNMPGMGINT 

QNPRISGPNPWPMPTLSPMGMTQPLSHSNQ 

MPSPNAVGPNIPPHGVPMGPGLMSHNPIMGH 

UM^brJ'M Vri^LrRMurl^urrrVQbPrQQVFFP 

HNGPSGGQGSFPGGMGFPGEGPLGRPSNLPQ 

SSADAALCKPGGPGGPDSFTVLGNSMPSVFT 

DPDLQEVIRPGATGIPEFDLSRIIPSEKPSQTLQ 

YFPRGEVPGRKQPQGPGPGFSHMQGMMGEQ 

ArKMuLALruMOOrTjfr Vu 1 rDlrLLr 1 Ar E>MP 

GHNPMRPPAFLQQGMMGPHHRMMSPAQST 

MPGQPTLMSNPAAAVGMIPGKDRGPAGLYT 

HPGPVGSPGMMMSMQGMMGP\NRTS 


829 


2179 


A 


6797 


433 


3 


ASFFNFSICICKHLEVGPPVGHPAHDDVGGRH 

GPGGR/GSRSPRSLQCAPGGGRRSGCPAGSSP 

ASTCPPSPGGSGADRFGPSPPPPSREAAPTAG 

AAASSTSSGASCPPVPASSRWGVRSRTRSGSG 

GEREPRDRPSERPRLV 


830 


2180 


A 


6800 


3 


1911 


LPERAFGPRTPRAPRRRRRRLLLSPPPRPPPPL 

DREPRAPGPWLCPSRAGTAQDPARIRERRGR 

VAGGAAGPAMELRARGWWLLCAAAALVAC 

ARGDPASKSRSCGEVRQIYGAKGFSSS\DVPQ 

AEISGEHLKJCPQGYTCCTSEMEENLANRSHA

ELETALRDSSRVLQAMLATQLRSFDDHFQHL 

LNDSERTLQATFPGAFGELYTQNARAFRDLY 

SELRLYYRGANLHLEETLAEFWARLLERLFK 

QLHPQLLLPDDYLDCLGKQAEALRPF\GEAP\ 

RELRLRAT\RAVFVAAR\SFVQGLGVAS\DVVR 

KVAQVPLG\PEC\SRAVIEAGSYC/ALHCVGVP

GARPCPDYCRNVLKGCLANQADLDAEWRNL 

LDSMVLITDKFWGTSGVESVIGSVHTWLAEA 

INALQDNRDTLTAKVIQGCGNPKVNPQGPGP 

EEKRRRGKLAPRERPPSGTLEKLVSEAKAQL 

RDVQDFWISLPGTLCSEKMALSTASDDRCWN 

GMARGRYLPEVMGDGLANQINNPEVEVDIT 

KPDMTIRQQIMQLKTMTNRLRSAYNGNDVDF 

QDASDDGSGSGSGDGCLDDLCGRKVSRKSSS 

SRTPLTHALPGLSEQEGQKTSAASCPQPPTFL 

LPLIXFLALTVARPRWR 


831 


2181 


A 


6808 


2 


1522 


ASRHGMTPGALLMLLGALGPPLAPGVRGSEA 

EGRLREKLFSGYDSSVRPAREVGDRVRVSVG 

LILAQLISLNEKDEEMSTKVYLDLEWTDYRLS 

WDPAEHDGIDSLRITAESVWLPDWLLNNND 

GNFDVALDISVWSSDGSVRWQPPGIYRSSCS 

IQVTYFPFDWQNCTMVFSSYSYDSSEVSLQT 

GLGPDGQGHQEIHIHEGTFIENGQWENIHKPS 

KL IQPPGDPRGGREGQRQEVIF YLI IRRKPLFY 

LVNVIAPCILITLLAIFVFYLPPDAGEKMGLSIF 

ALLTLTVFLLLLADKVPETSLSVPIIIKYLMFT 

MVL VTFSVIL SVWLNLHHRSPHTHQMPL WV 

RQIFIHKLPLYLRLKRPKPERDLMPEPPHCSSP 

GSGWGRGTDEYFBRKPPSDFLFPKPNRFQPEL 

PADr\T Dt> T7TF\^DXTT> A VAT f DC I DT?\rt^OClCVI A 

oAr ULKKr ILfKjr IN KA V ALLj tiLKx. V V £>oio Y LA 

RQLQEQEDHDALKEDWQFVAMWDRLFLW 
TFIIFTSVGTLWIFLDATYHLPPPDPFP 


832 


2182 


A 


6824 


71 


1079 


ETMAKNPPENCEDCHILNAEAFKSKKICKSLK 

ICGLVFGILALTLIVLFWGSKHFWPEVPKKAY 

DMEHTFYSNGEKKKIYMEIDPVTRTEIFRSGN 

GTDETLEVHDFKNGYTGIYFVGLQKCFIKTQI 

KVIPEFSEPEEEIDENEEnTTFFEQSVIWVPAE 

KPIENRDFLKNSKJLEICDNV7MYW\INPTL\IS 
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SEQID 
NO: of
nucleotide
sequence


SEQ ID
NO: of
peptide
sequence


Method


SEQ
ID NO:
in
USSN
09/496
914


Predicted
beginning
nucleotide
location
corresponding
to first
amino acid
residue of
peptide
sequence


Predicted end
nucleotide
location
corresponding
to last amino
acid residue
of peptide
sequence


Amino acid sequence (A=Alanine, C=Cysteine,
D=Aspartic Acid, E=Glutamic Acid,
F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine,
M=Methionine, N=Asparagine, P=Proline,
Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan,
Y=Tyrosine, X=Unknown, *=Stop codon,
/=possible nucleotide deletion, \=possible
nucleotide insertion)














GTFAKQLHHNFAF1ILVSELQDFEEEGEDLHFP 

ANEKKGIEQNEQWWPQVKVEKTRHARQAS 

EEELPINDYTENGIEFDPMLDERGYCCIYCRR 

GNRYCRRVCEPLLGYYPYPYCYQGGRVICRV 

IMPCNWWVARMLGRV 


833 


2183 


A 


6846 


116 


602 


EAEGEQVCGAKCCGDAPHVENREEETAR1GP 

GVMESKEERALNNLIVENVNQENDEKDEKE 

Q VANKGEPL ALPLNV SEYC VPRGNRRRFRVR 

QPILQYRWDIMHRLGEPQARMREENMERJGE 

EVRQLMEKLREKQLSHSLRAVSTDPPHHDHH 

DEFC\LMP 


834 


2184 


A 


6851 


3 


2024 


mGVALLHLPGAAVIPNTNYMFQDALGGRSR 

GSREESPAPSRAPASASLWRRLVVVEAKMAA 

HAAAAAQAAAAQAAHAEAADSWYLALLGF 

AEHFRTSSPPKIRLCVHCLQAVFPFKPPQR1EA 

RTHLQLGSVLYHHTKNSEQARSHLEKAWLIS 

QQIPQFEDVKFEAASLLSELYCQENSVDAAKP 

LLRKAIQISQQTPYWHCRLLFQLAQLHTLEKD 

LVSACDLLGVGAEYARVVGSEYTRALFLLSK 

GMLLLMERKLQEVHPLLTLCGQIVENWQGN 

PIQKESLRVFFLVLQVTHYLDAGQVKSVKPC 

LKQLQQC1QTISTLHDDEILPSNPADLFHWLP 

KEHMCVLVYL VTVMH SMQAGYLEKAQKYT 

DKALMQLEKLKMLDCSPILSSFQVELLEHIIM 

HAAQLHTLLGLYCVSVNCMDNAEAQFTTAL 

RLTNH Q EL W AFI VTNL A S VY I REGNRHQ E V V\ 

LYSLLERINPDHSFPVSSHCLRAAAFYVRGLF 

SFFQGRYNEAKRFLRETLKMSNAEDLNRLTA 

CSLVLLGHIFYVLGNHRESNNMWPAMQLAS 

KIPDMSVQLWSSALLRDLNKACGNAMDAHE 

AAQMHQNFSQQLLQDHIEACSLPEHNLITWT 

DGPPPVQFQAQNGPNTSLASLL 


835 


2185 


A 


6855 


334 


1268 


PTRRPILPLTSPKAISVPSPLQGKQHTLVKSCL 

O VoOlVJOrL V oLool^XlvlA^ 1 L./V V O V I AJUlvT W O 

AYVPCQTQDRDALRLTLEQIDLIRRMC AS Y SE 

LELVTSAKALNDTQKLACLIGVEGGHSLDNS 

LSILRTFYMLGVRYLTLTHTCNTPWAESSAK 

vj v nor i inj^iovji_/ i ur vjejv v v /tjdiviin iv i_j vj iyiivi v 

DLSHVSDAVARRALEVSQAPVIFSHSAARGV 

CNSARNVPDDILQLLEEERWAFVMVSLFHGE 

LIQWQPIRPMCSTVADHFDHIKAVMGSKFIGI 

GGDYDGAGKYRKKTTCKAPWRTSSRMSS 


836 


2186 


A 


6862 


315 


11 


PPRSRPSCWRXKVGPGRPWWWGGTGPPGQG 
RPEIRLLPLPMTGACGAVAASRTGSSGPG/SSL 
PNGHGGKGSGLANGLAGMAGHLGLGSSFGT 
GPGSGRPPP 


837 


2187 


A 


6863 


2 


1615 


VLRGQRGPAG GLAEERRRGRNE WRIHDVTT 

APFPGLVQRRSRLLIVSQVRYFLKNKVSPDLC 

NEDGLTALHQ CCIDNFEE1VKLLLSHGANVN 

AKTVWF1 WTPI HAAATPGHnsfl VK1I VOYGA 

DLLAVNSDGNMPYDLCEDEPTLDV1ETCMAY 

QGITQEKINEMRVAPEQQK4IADIHCM1AAGQ 

DLDWIDAQGATLLHIAGANGYLRAAELLLDH 

GVRVDVKDWDGWEPLHAAAFWGQMQN1AE 

LLVSHGAN\LNARTSMDEMPIDLCEEEEFKVL 

LLELK\HKHDVIMKSQLRHKSSLSRRTSHRQA 

S/SVGKVVRRTQPVGTGPNL\YRKEYE/GEEAI 

LWQRSAVAEDQRTSTYNGDIRETVRTDQENKD 
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SEQ ID 
NO: of 
nucl- 
eotide 
seq- 
uence 



SEQ ID 
NO: of 
peptide 
seq- 
uence 



838 



2188 



Method



SEQ 
ID NO: 
in 

USSN 

09/496 

914 



6865 



Predicted 
beginning 
nucleotide 
location 
corresponding
to first
amino acid 
residue of 
peptide 
sequence 



6291 



Predicted end 
nucleotide 
location 
corresponding 
to last amino 
acid residue 
of peptide 
sequence 



739 



Amino acid sequence (A=Alanine, C=Cysteine,
D=Aspartic Acid, E=Glutamic Acid,
F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine,
M=Methionine, N=Asparagine, P=Proline,
Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan,
Y=Tyrosine, X=Unknown, *=Stop codon,
/=possible nucleotide deletion, \=possible
nucleotide insertion)



PNPRLEKAPVLLSEFPTKIPRGELDMPVENGLR 

APVSAYQYAEANGDVWKVHEVPDYSMAYG 

NPGVADATPPWSSYKEQSPQTLLELKRQRAA 

AKLLSHPFLSTHLGSSMARTGESSSEGKAPLI 

GGRTSPYSSNGTSV YYTVTSGDPPLLKFKAPI 

EEMEEKVHGCCRIS 



AGPLEPRVQGAMALQLWALTLLGLLGAGAS 

LRPRKLDFFRSEKELNHLAVDEA SGVV YLGA 

VNALYQLDAKLQLEQQVATGPVLDNKKCTP 

PIEASQCHEAEMTDNVNQLLLVDPPRKRLVE 

CGQLLKGI\CALRALSNISLRLFYEDGSGEKSF 

VASNDEGVATVGLVSSTGPGGDRVLFVGKG 

NGPHDNGIIYSTRLLDRTDSREAFEAYTDHAT 

YKAGYLSTNTQQFVAAFEDGPYVFFVFNQQD 

KHPARNRTLLARMCREDPNYYSYLEMDLQC 

RDPDIHAAAFGTCLAASVAAPGSGRVLYAVF 

SRDSRSSGGPGAGLCLFPLDEVHAKMEANRN 

ACYTGTREARDIFYKPFHGDIQCGGHAPGSSK 

SFPCGSEHLPYPLGSRDGLRGTAVLQRGGLN 

LTAVTVAAENNHTVAFLGTSDGRILKVYLTP 

DGTSSEYDSILVEINKRVKRDLVLSGDLGSLY 

AMTQDKVFRLPVQECLSYPTCTQCRDSQDPY 

CG WC VVEGRCTRKAECPRAEEA SH WL WSRS 

KSCVAVTSAQPQNMSRRAQGEVQLTVSPLPA 

LSEEDELLCLFGESPPHPARVEGEAVICNSPSS 

IPVTPPGQDHVAVTIQLLLRRGNIFLTSYQYPF 

YDCRQAMSLEENLPCISCVSNRWTCQWDLR 

YHECREASPNPEDGIVRAHMEDSCPQFLGPSP 

LVIPMNHETDVNFQGKNLDTVKGSSLHVGSD 

LLKFMEPVTMQESGTFAFRTPKLSHDANETL 

PLHLYVKSYGKNIDSKLHVTLYDCSFGRSDC 

SLCRAANPDYRCAWCGGQSRCVYEALCNTT 

SECPPPVITRIQPETGPLGGGIRITILGSNLGVQ 

AGDIQRISVAGRNCSFQPERYSVSTRIVCVIEA 

AETPFTGGVEVDVFGKLGRSPPNVQFTFQQP 

KPLSVEPQQGPQAGGTTLTIHGTHLDTGSQED 

VRVTLNGVPCKVTKFGAQLQCVTGPQATRG 

QMLLEVSYGGSPVPNPGIFFTYRENPVLRAFE 

PLRSFASGGRSINVTGQGFSLIQRFAMVVIAEP 

LQSWQPPREAESLQPMTWGTDYVFHNDTK 

WFLSPAVPEEPEAYNLTVLIEMDGHRALLRT 

EAGAFEYVPDPTFENFTGGVKKQVNKLIRAR 

GTNLNKAMTLQEAEAFVGAERCTMKTLTET 

DLYCEPPEVQPPPKRRQKRDTTHNLPEFIVKF 

GSRE WVLGRVEYDTRVSDVPLSLILPL VI VPM 

WVIAVSVYCYWRKSQQAEREYEKIKSQLEG 

LEESVRDRCKKEFTDLMIEMEDQTNDVHEAG 

IPVLDYKTYTDRVFFLPSKDGDKDVMITGKL 

DIPEPRRPWEQALYQFSNLLNSKSFLINFIHT 

L\ENQPEFSARAKVYFASLLTVALHGKLEYYT 

DIMFITLFLELLEQYVVAKNPKLMLRRSETVV 

ERMLSNWMSICLYQYLKDSAGEPLYKLFKAI 

KHQVEKGPVDAVQKKAKYTLNDTGLLGDD 

VEYAPLTVSVIVQDEGVDAIPVKVLNCDTISQ 

VKEKIIDQVYRGQPCSCWPRPDSVVLEWRPG 

STAQILSDLDLTSQREGRWKRVNTLMHYNVR 

DGATLILSKVGVSQQPEDSQQDLPGERHALL 

EEENRVWHLVRPTDEVDEGKSKRGSVKEKE 

RTKAITEIYLTRJLLSVKGTLQQFVDNFFQSVL 

APGHAVPPAVKYFFDFLDEQAEKHNIQDEDT1 
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SEQ ID 
NO: of 
nucl- 
eotide 
seq- 
uence 


SEQ ID 
NO: of
peptide
sequence


Method


SEQ
ID NO:
in
USSN
09/496
914


Predicted
beginning
nucleotide
location
corresponding
to first
amino acid
residue of
peptide
sequence


Predicted end
nucleotide
location
corresponding
to last amino
acid residue
of peptide
sequence


Amino acid sequence (A=Alanine, C=Cysteine,
D=Aspartic Acid, E=Glutamic Acid,
F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine,
M=Methionine, N=Asparagine, P=Proline,
Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan,
Y=Tyrosine, X=Unknown, *=Stop codon,
/=possible nucleotide deletion, \=possible
nucleotide insertion)














HIWKTNSLPLRFWVNILKNPHFIFDVHVHEVV 

DASLSV1AQTFMDACTRTEHKLSRDSPSNKLL 

YAKEISTYKKMVEDYYKGIRQMVQVSDQDM 

NTHLAEISRAHTDSLNTLVALHQLYQYTQKY 

YDEIINALEEDPAAQKMQLAFRLQQIAAALE 

NKVTDL 


839 


2189 


A 


6872 


1 


1485 


RARRLALQCHVCVCALTPGEQSGRRLPGQT 

WLMFSCFCFSLQDNSFSSTTVTECDEDPVSLH 

EDQTDCSSLRDENNKENYPDAGALVEEHAPP 

SWEPQQQNVEATVLVDSVLRPSMGNFKSRKP 

KSIFKAESGRSHGESQETEHVVSSQSECQVRA 

GTPAHESPQNNAFKCQETVVRL\QPR1DQRTAT 

SPKDAFETR\QDLNEEEAAQVHGVKDPAPAS 

DLHSVGTSRLLL/YHITDGDNPTAVRHGCSL/F 

SGQSQRFNLDPESAPSPPSTQQFMMPRSSSRC 

SCGDGKEPQTITQLTKHIQSLKRKIRKFEEKFE 

QEKKYRPSHGDKTSNPEVLKWMNDLAKGRK 

QLKELKXKLSEEQGSAPKGPPRNLLCEQPTVP 

RENGKPEAAGPEPSSSGEETPDAALTCLKERR 

EQLPPQEDSKVTKQDKNLIKPLYDRYRIIKQIL 

STPSLIPTIVSQDTCMLLLCTDV 


840 


2190 


A 


6873 


2 


2054 


FFRFYFSFIRLFAMSLADLTKTN1DEHFFGVAL 

ENNRRSAACKRSPGTGDFSRNSNASNKSVDY 

SRSQCSCGSLSSQYDYSEDFLCDCSEKAINRN 

YLKQPVVKEKEKKKYNVSKJSQSKGQKEISV 

EKKHTWNASLFNSQIHMIAQRRDAMAHRILS 

ARLH KJ KG LKNEL ADMHHKLE AI LTEN QF LK 

QLQLRHLKAIGKYENSQNNLPQIMAKHQNEV 

KNLRQLLRKSQEKERTLSRJKLRETDSQLLKT 

KDILQALQKLSEDKNLAEREELTHKXSIITTK 

MDANDKKIQSLEKQLRLNCRAFSRQLAIETR 

KTLAAQTATKTLQVEVKHLQQKLKEKDREL 

EIKNIYSHRILKNLHDTEDYPKVSSTKSVQAD 

RKJ LPFTSMRHQGTQK SD VPPL/TTK G KKATG 

EDL SGEEKHLEVQILLENTGRQKDKKEDQEK 

KN1FVKEEQELPPK1IEVIHPERESNQEDVLVR 

EKFKRSMQRNGVDDT\LGKGTAPYTKGPLRQ 

RRHYSFTEATENLHHGLPASGGPANAGNMR 

YSHSTGKHLSNREEMELEHSVDSGYEPSFGKS 

SRIKVKDTTFRDKKSSLMEELFGSGYVLKTD 

QSSPGVAKGSEEPLQSKESHPLPPSQASTSHA 

FGDSKVTVVNSIKPSSPTEGKRK11I 


841 


2191 


A 


6874 


3 


2867 


S SRTREMEEKEILRRQIRLLQGLIDD YKTLHG 

NAPAPGTPAASGWQPPTYHS GRAFS ARYPRP 

SRRG Y SSHH GPS WRKK Y SL VNRPPG PSDPPA 

DHAVRPLHGARGGQPPVPQQHVLERQVQLS 

QGQNWIKVKPPSKSGSASASGAQRGSLEEFE 

DTPWSDQRPREGEGEPPRGQLQPSRPTRARG 

TCSVEDPLLVCQKEPGKPRMVKSVGSVGDSP 

RKLGSHSVASCAPQLLGDRRVDAGHTDQPVP 

SGSVGGPARPASGPRQAREASLVVTCRTNKF 

RKNNYKWVAASSKSPRVARRALSPRVAAEN 

VCKASAGMANKVEKPQLIADPEPKPRKPATS 

SKPGSAPSKYKWKASSPSASSSSSFRWQSEAG 

SKDHASQLSPVLSRSPSGDNRPALAHSGLKPLS 

GETPLSAYKVKTRTKIIRRRGSTSLPGDKKSG 

TSPAATAKSHLSLRRRQALRGKSSPVLKKTPN 






NO: of 
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seq- 
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seq- 
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rTcuicieu 

beginning 

nucleotide 
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correspondi 

ng to first 

amino acid 

residue of 

peptide 

sequence 


Predicted end 
nucleotide 
location 
corresponding 
to last amino 
acid residue 
of peptide 
sequence 


Amino acid sequence (A=Alanine, C=Cysteine,
D=Aspartic Acid, E=Glutamic Acid,
F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine,
M=Methionine, N=Asparagine, P=Proline,
Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan,
Y=Tyrosine, X=Unknown, *=Stop codon,
/=possible nucleotide deletion, \=possible
nucleotide insertion)














KGLVQVTKHRLCRLPPSRAHLPTKEASSLHA 

VRTAPTSKVIKTRYRIVKKTPASPLSAPPFPLS 

LPSWRARRLSLSRSLVLNRLRPVASGGGKAQ 

PGSPWWRSKGYRCIGGVLYKVSANKLSKTSG 

QPSDAGSRPLLRTGRLDPAGSCSRSLASRAVQ 

KbJLA I IKl^ AKQRRbKJcICh Y CM YYN RF GRCNR 

GERCPYIHDPEKVAVCTRFVRGTCKKTDGTC 

PFSHHVSKEKMPVCSYFLKGICSNSNCPYSHV 

YVSRKAEVCSDFLKGYCPLGAKCKKKHTIXC 

PDFARRGACPRGAQCQLLHRTQKRHSRRAAT 

.SPAPGPSDATARSRVSASHGPRKPSASQRPTR 

QTPSSAALTAAAVAAPPHCPGGSASPSSSKAS 

SSSSSSSSPPASLDHEAAPSLQEAALAAACSNR 

LCKLPSFISLQSSPSPGAQPRVRAPRAPLTKDS 

GKPLHIKPRL 


842 


2192 


A 


6898 


506 


2071 


WPDLVHTWSSEEAMGSCCSCPDKDTVPDNH 

RNKFKVINVDDDGNELGSGIMELTDTELILYT 

RKRDSVKWHYLCLRRYGYDSNLFSFESGRRC 

QTGQGIFAFKCARAEELFNMLQEIMQNNSIN 

WEEPVVERNNHQTELEVPRTPRTPTTPGFAA 

QNLPNGYPRYPSFGDASSHPSSRHPSVGSARL 

PSVGEESTHPLLVAEEQVHTYVNTTGVQEER 

KNRTSVHVPLEARVSNAESSTPKEEPSSIEDR 

DPQILLEPEGVKFVLGPTPVQKQLMEKEKLE 

QLGRDQVSGSGANNTEWDTGYDSDERRD AP 

SVNKLVYENINGLSIPSASGVRRGRLTSTSTSD 

TQNINNSAQRRTALLNYENLPSLPPVWEARK 

LSRDEDDNLGPKTPSLNGYHNNLDPMHNYV 

NTENVTVPASAHKIEYSRRRDCTPTVFNFDIR 

RFbLhHRQLN Y IQ VDLEGGSD SDNPQTPKTPT 

TPLPQTPTRRTELYAVIDIERTAAMSNLQKAL 

PRDDGTSR\KTRHNST\DLPL 


843 


2193 


A 


6919 


2 


663 


AGRPGTTHASGKMAYQSLRLEYLQIPPVSRA 

YTTACVLTTAAVQLELITPFQLYFNPELIFKHF 

{Jl WKL1 1 Nr Lr r UP VOrNrLFNMIFLYRYCRM 

LEEGSFRGRTADFVFMFLFGGFLMTLFGLFVS 

L/VFLGPGLYNN/GSSMCGAEVEPLCPHELLRP 

SQLPGPLSALGAHGIFLWGELNHCGPFGYCS 

WTHIFFLGRCISQSTWWNKNSENTIYFESYF 


844 


2194 


A 


6928 


902 


366 


HRLCMPIQGACGERME/FSLLLPGLECNGVIL 
AHCNLRLPGSSNSPASASQVAGITGVCHHAR 
LIFVFSVETGFLHAGQAGLELLTSGDPPASAS 
QSAGITGKSQHTRPGYEFIIPYSAAQEDALKA 
LM 


845 


2195 


A 


6939 


1660 


317 


LYPENLGESLFPILLLPPPWPDGGRPCCVEMS 

TRAKKLRRIWRILEEKESVAGAVQTLLLRSQE 

GGV\TSAAASTLSEPPRRTQESRTRTRALGLPT 

LPMEKLAASTEPQGPRPVLGRESVQVPDDQD 

FRSFRSECEAEVGWNLTYSRAGVSVWVQAV 

FMDRTT TTkTTT<rPRMT'Pr r nVP APT! VTYV/T urm? 

YRKKWDSNVIETFDIARLTVNADVGYYSWR 
CPKPLfCNRD VITLRS WLPMG AD YIIMNYS VK 
HPKYPPRKDLVRAVS IQTGYLIQSTGPKSCVTT 
YLAQVDPKGSLPKWWNKSSQFLAPKAMKK 
MYKACLKYPEWKQKHL\PHFKPWL\HPEQSP 
LPSLALS\ELSVQHADS\LENIDESAV\AESREE 
R\MGGAGGEG\SDDDTSLYAEAPHRFRETETG 
PGAGRALGAAAAPALSPLHPPGTWWHRARP 
RRVLQPG WTEPQ 
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SEQ ID 
NO: of
peptide 
seq- 
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FlOu 


SEQ

ID NO:
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in 
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location 

corresponding

to first

amino acid 

residue of 

peptide 

sequence 


Predicted end 
nucleotide 
location 
corresponding 
to last amino 
acid residue 
of peptide 
sequence 


Amino acid sequence (A=Alanine, C=Cysteine,
D=Aspartic Acid, E=Glutamic Acid,
F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine,
M=Methionine, N=Asparagine, P=Proline,
Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan,
Y=Tyrosine, X=Unknown, *=Stop codon,
/=possible nucleotide deletion, \=possible
nucleotide insertion)


846 


2196 


A 


6944 


42 


2672 


RRKMAGCRGSLCCCCRWCCCCGERETRTPE 

ELTILGETQEEEDEILPRKDYESLDYDRCINDP 

YLEVLETMDNKKGRRYEAVKWMVVFAIGV 

CTGLVGLFVDFFVRLFTQLKFGWQTSVEECS 

QKGCLALSLLELLGFNLTFVFLESLLGLIEPVE 

AGSGITEGKCYLYARQVPGLVRLPTLLWKAL 

GVLLTVAAMLLAGLGSPMIHSGSWGAGLPQ 

FQSISLRK1QFNFPYFRSDRYGK\DKRDFVSAG 

AAAGVAAAFGAPIGGTLFSLEEGSSFWNQGL 

TWKVLFCSMSA1FILNFFRSGIQFGSWGSFQL 

PGLLNFGEFKCSDSDKKCHLWTAMDLGFFV 

VMGVIGGLLGATFNCLNKRLAKYRMRNVHP 

KPKLVRVLESLLVSLVTTWVFVASMVLGEC 

RQMSSSSQIGNDSFQLQVTEDVNSSIKTFFCP 

NDTYNDMATLFFNPQESAILQLFHQDGTFSPV 

TLALFFVLYFLLACWTYGISVPSGLFVPSLLC 

GAAFGRLVANVLKSYIGLGHIYSGTFALIGAA 

AFLGG WRMTI SLTVILIEST\NEITYGLPIMVT 

LMVGKWTGDFFNKGI\YDIHVGLRGVPLLEW 

bl h V cMDNJUKAbUIMbrNLl Y V YrHl KJl^bLV 

SILRTTVHHAFPVVTENRGNEKEFMKGNQLIS 

NNIKFKKSSILTRAGEQRKRSQSMKSYPSSEL 

RNMCDEHIASEEPAEKEDLLQQMLERRYTPY 

PNLYPDQSPSEDWTMEERFRPLTFHGLILRSQ 

LVTLLVRGVCYSESQSSASQPRLSYAEMAED 

YPRYPDIHDLDLTLLNPRMTVDVTPYMNPSPF 

TVSPNTHVSQVFNLFRTMGLRHLPVVNAVGE 

rvGIITRHNLTYEFLQARLRQHYQTI 


847 


2197 


A 


6951 


3 


1994 


NTNSSS VTN S AAGVEDLNIVQVTVPDNEKER 

LSSIEKIKQLREQVNDLFSRKFGEAIGVDFPVK 

VPYRKITFNPGCVVIDGMPPGVVFKAPGYLEI 

SSMRRILEAAEFUCFTVLRPLPGLELSNGEYST 

VGKRKIDQEGRVFQEKWERAYFFVEVQNIST 

CLICKRSMSVSKEYNLRRHYQTNHSKHYDQY 

MERMRDEKLHELKKGLRKYLLGLSDTECPE 

QKQVFANPSPTQKSPVQPVEDLAGNLWEKLR 

EKIRSFVAYSIAIDEITDfNNTTQLAIFIRGVDE 

NFDVSEELLDTVPMTGTKSGNEIFSRVEKSLK 

NFCINWSKLVSVASTGTPPMVDANNGLVTKL 

KSRVATFCKGAELKSICCIIHPESLCAQMCLKM 

JJil V MU V V V 1S.O V IN W ICoKULIN JlOiir i 1 LL I CL 

DSQYGSLLYYTEIKWLSRGLVLKJRPFESLEEI 
DSFMSSRGKPLPQLSSIDWIRDLAFLVDMTM 
HT "NAT "NI19T nOH^OT VTOMYHT TR ART AKI CI 

WETHLTRNNLAHFPTLKLVSRNESDGLNYIP 
KJAELKTEFQKRLSDFKLYESELTLF SSPFSTKI 
DSVHEELQMEVIDLQCNTVLKTKYDKVGIPE 
FYKYLWGSYPKYKHHCAK1LSMFGSTY1CEQ 
LFSIMKJLSKTKYC SQLKDSQ WDSVLHIAT 


848 


2198 


A 


6985 


3 


289 


SVQYLPGRPTRTHASTDAPLMLKFTPLPSKTK 
ASAPVQCLLLMAATFSPQGLAKPHSGTIPIT\C 


849 


2199 


A 


6999 


963 


5 


LDFLCHRDMGDNITSITEFLLLGFPVGPRIQM 

LLFGI^SIJYWTLLGNGTILGLISLDSRLHAP 

MYFFLSHLXAVVDIAYACNTVPRMLVNLLHP 

AKPISFAGRMMQTFLFSTFAVTECLLLVVMS 

YDLYWAICHPLRYLAIMTWRVCITLAVTSWT 

TGVLLSLJJHLVLLLPLPFCRPQKIYHFFCEJXA 

VLKLACADTHINENMVLAGAISGLVGPLSTIY 

VSYMCJXCAILQIQSREVQRKAFCTCFSHLCV1 
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GLFYGTAIIMYVGPRYGNPKEQKKYLLLFHS 
LFWMLNPLICSLRNSEVKNTLKRVLGVERAL 


850 


2200 


A 


7001 


1 


1011 


MGNDSVSYEYGDYSDLSDRPVDCLDGACLAI 

DPLRVAPLPLYAAIFLVGVPGNAMVAWVAG 

KVARRRVGATWLLHLAVADLLCCLSLPILAV 

PIARGGHWPYGAVGCRALPSHLLTMYASVLL 

LAALSADLCFLALGPAWXCLRFS/GACGVQVA 

CGAAWTLALLLTVPSAIYRRLHQEHFPARLQ 

CWDYGGSSSTENAVTAIRFLFGFLGPLVAVA 

SCHS ALLCWAARRCRPLGTAIVVGFFVC WAP 

YHLLGLVLTVAAPNSALLARALRAEPLIVGL 

ALAHSCLNPMLFLYFGRAQLJRRSLPAACHW 

ALRESQGQDESVDSKKSTSHDLVSEMEV 


851 


2201 


A 


7011 


1 


2310 


AAASPLRMSRKGPRAEVCADCSAPDPGWASI 

SRGVLVCDECCSVHRSLGRHISIVKHLRHSA 

WPPTLLQMVHTLASNGANSI WEH SLLDPAQ V 

QSGPALKQTPKDKVXHPIKSEFIRAKYQMLAF 

VHKLPCRDDDGVTAKDLSKQLH SS VRTGNLE 

TCLRLLSLGAQANFFHPEKGTTPLHVAAKAG 

QTLQAELLVVYGADPGSPDVNGRTPIDYARQ 

AGHHELAERL VECQ YELTDRLAF Y LCGRXPD 

HKNGHYI IPQMADSLDLSELAKAAKKKLQ AL 

SNRLFEEL AMD VYD E VDRREN D A V W L ATQN 

HSTLVTERSAVPFLPVNPEYSATRNQGRQKL 

ARFNAREFATLIIDILSEAKRRQQGKSLSSPTD 

NLELSLRSQSDLDDQHDYDSVASDEDTDQEP 

LRSTGATRSNRARSMDSSDLSDGAVTLQEYL 

ELKKALATSEAKVQQLMKVNSSLSDELRRLQ 

REIHKLQAENLQLRQPPGPVPTPPLPSERAEH 

TPMAPGGSTHRRDRQAFSMYEPGSALKPFGG 

PPGDELTTRLQPFHSTELEDDA1YSVHVPAGL 

YRIRKGVSASAVPFTPSSPLLSCSQEGSRHTSK 

LSRHGSGADSDYENTQSGDPLLGLEGKRFLE 

LGKEEDFHPELESLDGDLDPGLPSTEDVILKT 

EQVTKNIQELLRAAQEFKHDSFVPCSEKIHLA 

VTEMASLFPKRPALEP VRSSLRLLN A SA YRLQ 

SECRKTVPPEPGAPVDFQLLTQQVIQCAYDIA 
KAAKQLVTITTREKKQ 


852 


2202 


A 


7016 


484 


1777 


RISKIQVYYSTGYSSRKMNPTLGLAIFLAVLL 

TVKGLLKPSFSPRNYKALSEVQGWKQRMAA 

KELARQNMDLGFKLLKKLAFYNPGRNIFLSP 

LSISTAFSMLCLGAQDSTLDEIKQGFNFRKMP 

EKDLHEGFHYIIHELTQKTQDLKLSIGNTLFID 

QRLQPQRKFLEDAKNFYSAETILTNFQNLEM 

AQKQINDFI/ESKTHGKINNLIENIDPGTVMLL 

ANYIFFRARWKHEFDPNVTKEEDFFLEKNSS 

VKVPMMFRSGIYQVGYDDKJLSCTILEIPYQK 

NiTAIFILPDEGKLKHLEKGLQVDTFSRWKTL 

LSRRWDVSVPRLHMTGTFDLKKTLSYIGVS 

KIFEEHGDLTK1APHRSLKVGEAVNKAELKM 

DERGTEGAAGTGAQTLPMETPLVVKIDKPYL 

LLIYSEKIPSVLFLGKIVNPIGK 


853 


2203 


A 


7017 


1 



3293 


MTHACNPSTLGGQGRRTTRSHGRRRSSRGPV 

ARHV AAGAGHENKHGGSRRFPAG VAPRRAM 

ANVSKKVSWSGRDRDDEEAAPLLRRTARPG 

GGTPLLNGAGPGAARQSPRSALFRVGHMSSV 

ELDDELLEP\DMDPPHPFPKEIPHNEKLLSLKY 

ESLDYDNSENQLFLEEERRJNHTAFRTVE1KR 

WVICALIGILTGLVACFIDIWENLAGLKYRVI 

KGSILPNIDKFTEKGGLSFSLLLWATLNAAFV 
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F=Phenylalanine. G^Glycine, H=Histidine, 
I=Isoleucine, K=Lysine, L=Leucine,
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LVGSVIVAFIEPVAAGSGIPQIKCFLNGVKIPH 

VVRLKTLVIKVSGVILSVVGGLAVGKEGPMI 

HSGSVI AAG] SQGRSTSLfCRDFKIFEYFRRDTE 

KRDFVSAGAAAGVSAAFGAPVGGVLFSLEEG 

ASFWNQFLTWRIFFASMISTFTLNFVLSIYHG 

NMWDLSSPGLINFGRFDSEKMAYTIHEIPVFI 

AMGVVGGVLGAVFNALNYWLTMFRTRYIHR 

PCLQVIEAVLVAAVTATVAFVLIYSSRDCQPL 

QGGSMSYPLQLFCADGEYNSMAAAFFNTPEK 

SVVSLFHDPPGSWPLTLGLFTLVYFFLACWT 

YGLTVSAGVFIPSLLIGAAWGRLFGISLSYLTG 

AAIWADPGKYALMGAAAQLGGIVRMTLSLT 

VIMMEATSNVTYGFP1MI ,VLMTAKIVGDVFIE 

GLYDMHIQLQSVPFLHWEAPVTSHSLTAREV 

MSTPVTCLRRREKVGV1VDVLSDTASNHNGF 

PWEHADDTQPARLQGLILRS QLIVLLKHKVF 

VERSNLGLVQRRLRLKDFRDAYPRFPPIQSIH 

VSQDERECTMDLSEFMNPSPYTVPQEASLPR 

VFKLFRALGLRHLVVVDNRNQWGLVTRKD 

LARYRLGKRGLEELbLAQ J OrKAQA 1 AfcCjK V 

AGAAQQPCQLRAVTLEDLGLLLAGGLASPEP 

LSLEELSERY ES SHPTSTAS VPEQDTAKHWNQ 

LEQWWELQAEVACLREHKQRCERATRSLL 

RELLQVRARVQLQGSELRQLQQEARPAAQAP 

EKEAPEFSGLQNQMQALDKREVEVREALTRL 

RRRQVQQEAERRGAEQEAGLRLAKXTDLLQ 

QEEQGREVACGALQKNQEDSSRRVDLEVAR 

M 


854 


2204 


A 


7037 


139 


2604 


AGTWEPRPYDQAKETGAPGSQPPVPPMELRP 

WLLWVVAATGTLVLLAADAQGQKVFTNTW 

AVRIPGGPAVANSVARKHGFLNLGQIFGDYY 

HFWHRGVTKRSLSPHRPRHSRLQREPQVQWL 

EQQVAKRRTKRDV YQEPTDPKFPQQ WYL\SG 

VTQ\RDLMVKAAWAQGYTGHGIWSILDDGI 

EKNHPDLAGNYDPGASFDVNDQDPDPQPRY 

TQMNDNRHGTRCAGEVAAVANNGVCGVGV 

AYNARIGGVRMLDGEVTDAVEARSLGLNPN 

HIHIYSASWGPEDDGKTVDGPARLAEEAFFR 

GVSQGRGGLGSIFVWASGNGGREHDSCNCD 

GYTNSIYTLSISSATQFGNVPWYSEACSSTLA 

TTYSSGNQNEKQIVTTDLRQKCTESHTGTSAS 

APLAAGIIALTLEANKNLTWRDMQHLVVQTS 

KPAHLNANDWATNGVGRKVSHSYGYGLLD 

AGAMVALAQNWTTVAPQRKCIIDILTEPKDI 

GKJILEVRKT VTACL GEPNHITRLEH AQARLT 

LSYNRRGDLAIHLVSPMGTRSTLLAARPHDY 

oAJJOrNL* WArM 1 I Ho WJJc-iJrovjE.Vv VljCltiN 

TSEANNYGTLTKFTLVLYGTAPEGLPVPPESS 

GCKTLTSSQACWCEEGFSLHQKSCVQHCPP 

GFAPQVLDTHYSTENDVETIRASVCAPCHAS 

CATCQGPALTDCLSCPSHASLDPVEQTCSRQS 

OS^RES PPOOOPPRLPPEVEAG ORLRAGLLPS 

HLPEVVAGLSCAFTVLVFVTVFLVLQLRSGFS 

FRGVKVYTMDRGLI SYKGLPPEAWQEECPSD 

SEEDEGRGERTAFIKDQSAL 


855 


2205 


A 


7058 


3 


1441 


QRPASQLLAPFAAEALPGAPRAAMAQHFSLA 

ACDWGFDLDHTLCRYNLPESAPLIYNSFAQF 

LVKEKGYDKELLNVTPEDWDFCCKGLALDL 

EDGNFLKLANNGTVLRASHGTKMMTPEVLA 

EAYGKKEWKHFLSDTGMACRSGK^nTFYDN 
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nucleotide insertion 














YFDLPGALLCARVVDYLTKLNNGQKTFDFW"" 
KD I V AA TO HN Y K M S A Fk: FNfTi T VT?pf tv d np n 

RYLHSRPESVKKWLRQLKNAGKILLLITSSHS 

DYCRLLCAVYILGNDFTDLFDIVITNALKPGFF 

SHLPSQRPFRTLENDEEQEAI .PSLDKPG W YSQ 

GNAVHLYELLKKMTGKPEPKWYFGDSMHS 

DIFPARHYSNWETVLILEELRGDEGTRSQRPE 

ESEPLEKKGKYEGPKAKPLNTSSKKWGSFFM 

DSVLGLENTEDSLVYTWSCKRISTYSTIAIPSI 

EAIAELPLD YKFTRFS SSNSKTAG Y YPNPPL V 

LSSDETLISK 


856 


2206 


A 


7082 


396 


1635 


SSPSVFEFEHAVQPVFTMEFLKTCVLRRNACt 
AVCFWRSKWQKPSVRRJSTTSPRSTVMPAW 
VIDKYGKNEVLRFTQNMMMPIIHYPNEVIVK 
VHAASVNPIDVNMRSGYGATALNMKRDPLH 
Vigils. oUfcr rL, 1 LUKJJ V V VMhXJGLDVKYFK 

PGDEVWAAVPPWKQGTLSEFVWSGNEVSH 

KPKSLTHTQAASLPYVALTAWSAINKVGGLN 

DKNCTGKRVL1LGASGGVGTFAIQVMKAWD 

AHVTAVCSQDASELVRKLGADDVIDYKSGSV 

EEQLKSLKPFDFILDNVGGSTETWAPDFLKK 

WSGATYVTLVTPFLLNMDRLGIADGMLQTG 

VTVGSKALKHFWKGVHYRWAFFMASGPCL 

DDIAELVDAGKIRPVMEQTFPFSKVPEAFLKV 

ERGHARGKTVINW 


857 


2207 


A 


7088 

* 


320 


2417 


LRRRKMTPQSLLQTTLFLLSLLFLVQGAHGR 

GHREDFRFCSQRNQTHRSSLHYKPTPDLRISIE 

NSEEALTVHAPFPAAHPASRSFPDPRGLYHFC 

LYWNRHAGRLHLLYGIQRJDFLLSDKASSLLCF 

QHQEESLAQGPPLLATSVTSWWSPQNISLPSA 

ASFTFSFHSPPHTGAHNASVDMCELKRDLQL 

LSQFLKHPQKASRRPSAAPASQQLQSLESKLT 

SVRFMGDMGSFEEDRINATVWKLQPTAGLQ 

DLHIHSRQEEEQSEIMEYSVLLPRTLFQRTKG 

RSGEAEKRLLLVDFSSQALFQDKNSSQVLGE 

KVLGIVVQNTKVANLTEPVYLTFQHQLQPKN 

VTLQCVFWVEDPTLSSPGHWSSAGCETVRRE 

TQTS CFCNHLTYFA VLMVSS VEVDA VHKHY 

RRKPRDYT1KVHMNLLLAVFLLDTSFLLSEPV 

ALTGSEAGCRASAIFLHFSLLTCLSWMGLEG 
YNLYRLVVFVFGTYVPnYl T Kf <J A \/fri wnEDT 

FLVTLVALVDVDNYGPIILAVHRTPEGVIYPS 
MCWIRDSLVSYITNLGLPSLVFLFNMAMLAT 
MVVQILRLRPHTQK WSHVLTLLCLSLVL GVLP 
WALIFFSFASGTFOI VV7 YT F^TTT^FOnPl 7FI 

WYWSMRLQARGGPSPLKSNSDSARLPISSGS 
TSSSRI 


858 


2208 


A 


7091 


185 


415 


DAGAVKSSDTNIWFRGMCDDKKGHRCPS+G - 

QPQHFHVAFHTEAEGAMFYFRLHVIHRVMQS 
QQQLFPSTLFSWLLE 


859 


2209 


A 


7136 


3 


302 


FFFWRQSLALLPRLECSGATGAHCNLHFPGSS 

DCPTS AS* I AGITGAC YHAWLLFVFLAETGFH 

HVG QGGLELLTSSDPSGSASQSAGITGVSHCT 
WPI 


860 


2210 


A 


7156 


23 


591 


ALSTETRTPDMRRLLLVTSLVVVLLWEAGAV 
PAPKVPIKMQVKHWPSEQDPEKAWGARWE 
PPEKDDQLVVLFPVQKPKLLTTEEKPRGQGR 
GPILPGTKA WMETEDTLGRVLSPEPDHDSLY 
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HPPPEEDQGEERPRLWVMPNHQVLLGPEEDQ 
DHIYHPQ* GSRGHHCPRPVPRPRLLGLGPSLP 
CPS 


861 


2211 


A 


/ loi 




lvVJ 


NYVCTI AF*EKKMGF* LSLSCL VLLFVLFLDCI 

LTTTTRIMFHCTYLFASVCLSLLNTLLSPNCL 

KSAMILQ 


862 


2212 


A 


7211 


665 


847 


LKYYHITMGIYKTGKKVIL*KSSMSNRFSVIF 
YKNIQKLSFSNYVYHQNYVFSSDWSYDF 


863 


2213 


A 


7212 


924 


1273 


HGSSCALGDLAPG*LPSGPVLSSPAVRL*RKP 
L V WDSPSCLPATGPT* GLVL VLGGPDCT* W A 
RGQHEHKRMRAP* SCRVTVNL AKKKKKTDQ 
CIKPNYQSPPKECDYNILANSVA 


864 


2214 


A 


7214 


845 


1619 


DPKVPVDADHVQGQDPGRAAHDIHGEDVTE 
wQitrnpT ApnFvnnTnFnHnRHnHRFVGOR 

JX V OISJJ.T L*/vr LfCt V kJLs 1 j^XjOflLirUJ vjl i±vi_i v vj v^iv 

HGHDQEEVAYEERACEGGKFATVEVTDKPV 

DEALREAMPKVAKYAGGTODKGIGMGMTV 

PISFAVFPNEDGSLQKKLKVWFRIPNQFQSDP 

PAPSDKSVKIEEREGITVYSMQFGGYAKEAD 

YVAQATRLRAALEGTATYRGDIYFCTGYDPP 

MKPYGRRNE1WLLKT 


865 


2215 


A 


7246 


559 


682 


RRLGA V AHA YTSSTLGGRGG WIT* GQELQTS 
LANMAKr KJL. i 


866 


2216 


A 


7257 


641 


1310 


TCTYKYLMGW1RGRRSRHSWEMSEFHNYNL 
DLKKSDFSTRWQKQRCPWKSKCRENASPFF 
trrTTT AV AK/truPFITVlVAIW^AVFI N^II FNOFV 

QIPLTESYCGPCPKNWICYKNNCYQFFDESKN 
WYESQASCMSQNASLLKVYSKEDQDLLKLV 
TfQYHWMfiT VHTPTTslGSWOWEDGSILSPNLLT 
IIEMQKGDCALYASSFKGYIENCSTPNTYICM 
QRTV 


867 


2217 


A 


7288 


151 


396 


SIKIIEAFGSNGPDF WFFRY W SP* LFRQQVVF1 
MPFFQTLWLMNANRFCSrFTTTNVANNCWW 
TPYHCWLSWVCRCESHGI 


868 


2218 


A 


7298 


3 


272 


PDTV1GGRGSGGKEFGRWVLW* VFE*RLGTP 
KGSCPAGGSRMVSESD*EGRGC*ASYPCAC* 
AGS* WR* GSRPAGRGTPPRSLSH ARPP 


869 


2219 


A 


7332 


1223 


332


PRRDAEDRDESCLNPAFPIGLLHPNSVNSMAR 
tt rrwi i t t npr:i T Atvw AFP^nnPATPS 

YRLVRPADINFLACVMECEGKLPSLKIWETC 

KELLQLSKPELPQDGTSTLRENSKPEESHLLA 

KRYGGFMKRYGGFMKKMDELYPMEPEEEA 

NGSE1LAKRYGGFMKKDAEEDDSLANSSDLL 

KELLETGDNRERSHHQDGSDNEEEVSKRYGG 

FMRGLKRSPQLKEKAKELQICRY GGFMRRVG 

PQKW*MTSPQNRYGGFLKRFAEALPSDEEGE 

SYSKEVPEMEKRYGGFMRF 


870 


2220 


A 


7382 


216 


-1 A 1 O 

lOlo 


nmAwr TPUTOPI riF<5RR7\JPN<l*OANT I RGPifi 

AGQGRGREGAESGGSRGEGPGSDGRLPATGD 

FWSPRSQRRGCCGRRAPRPEAiMENGAVYSPT 

TEEDPGPARGPRSGLAAYFFMGRLPLLRRVL 

KGLQLLLSLLAFICEEWSQCTLCGGLYFFEF 

VSCSAFLLSLLIUVYCTPFYERVDTTKVKSSD 

FYITLGTGCVFLLASIIFVSTHDRTSAEIAAIVF 

GFIASFMFIXDFITMLYEICRQESQLRKPENTT 

RAEALTEPLNA 


871 


2221 


A 


7403 


3 


393 


SCAMCSGLL*LLLPIWLSWTLGTRGSEPRSVN 
DPGNMSFVKETVDKLLTGFRCFREREAAPRR 
ALRGAALPGESEAGDPESLRSSVNADWIQYS 
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DLWEAEVSTPRCEAGFCQECFRTPGNQEKDG 
PFIC 


872 


2222 


A 


7413 


1061 


359 


FVDIVSVVEFPHCPEARFPAQHGQDSl^RLTLC 
PGGS* PQATLHLDRMRVS ASPTKEIQVKKYK 
CGLIBCPCPANYFAFKICSGAANWGPTMCFED 
RM1MSPVKNNVGRGLNIALVNGTTGAVLGQ 
KAFDMYSGDVMHL VKFLKE1PG GALVLVAS 
YDDPGTTCMNDESRKLFSDLGSSYAKQLGFRD 
SWVFIGAKDLRGKSPFEQFLKEQPQTQNKYE 
GWPELLEMEGCMPPKPF 


873 


2223 


A 


7429 


2242 


2394 


ILKCAGHGGSCL* SQHFGRLRWEDRLRLGVQ 
DHPGQHCETPSLLKIERKLF 


874 


2224 


A 


7468 


146 


894 


PCTSC VLWATLHLPASTRKAPQAECGMI SITE 
WQKIGVGITGFGIFFILFGTLLYFDSVLLAFGN 
LLFLTGLSLIIGLRKTFWFFFQRHKLKGTSFLL 
GGWrVLLRWPLLGMFLETYGFFSLFKGFFPV 
AFGFLGNVCNIPFLGALFRRLQGTSSMV*KTE 
MSSLNLDHWLKGAKREEWEPPPQSPALTHSP 
TYPGPPQVQKERNGAEQLTSNPQVDSRGCQE 
AEMQTPRRLGWGWYHTLTLYLWEEK 


875 


2225 


A 


7498 


91 


251 


GEKPVPTWLQDEAGQWLLGFVAQPWGWPG 
SERHEP* HGG VLFRLGPSAPPGKL 


876 


2226 


A 


7544 


403 


587 


YSCLCFJLFKHITSFKNSVHIWLGTVVHAYNPN 
ILGGQGGWIA*GQEFKTSLGNTVRPCLYK 


877 


2227 


A 


7566 


2 


940 


GC APDTRFFVPEPGGRG AAPWV ALV ARGGC 

TFKDKVLVAARRNASAWLYNEERYGNITLP 

MSHAGTGNIVVIMI SYPKGREILELVQKGIPV 

TMTIGVGTRHVQEFISGQSWFVAIAFITMMII 

SLAWLIFYYIQRFLYTGSQIGSQSHRKETKKVI 

GQLLLHTVKHGEKGIDVDAENCAVCIENFKV 

KDIlRILPCKHIFHRICIDPWLLDHRrCPMCKL 

DVIKALGYWGEPGDVQEMPAPESPPGRDPAA 

NLSLALPDDDGSDESSPPSASPAESEPQCDPSF 

KGDAGENTALLEAGRSDSRHGGPIS 


878 


2228 


A 


7586 


315 


1232 


ERSLLCKVDVRWIYVSEGTKTQRRHRQGSLR 

RGRMQAACWYVLFLLQPTVYLVTCANLTNG 

GKSELLKSGSSKSTLKHIWTESSKDLSISRLLS 

QTFRGKENDTDLDLRYDTPEPYSEQDLWDW 

LRNSTDLQEPRPRAKRRPIVKTGKFKKMFGW 

GDFHSNIKTVKXNLLITGKIVDHGNGTFSVYF 

RHNSTGQGNVSVSLVPPTKIVEFDLAQQTVID 

AKDSKSFNCRIEYEKVDKATKNTLCNYDPSK 

TCYQEQTQSHVSWLCSKPFKVICIYISFYSTD 

YKLVQKVCPDYNYHSDTPYFPSG 


879 


2229 


A 


7605 


479 


391 


TESWKLKWWSPTCLDQLNGSAPGNVFIHG 


880 


2230 


A 


7612 


93 


659 


DAAVAMTAQGGL VANRGRRFK WAIEL SGPG 

GGSRGRSDRGSGQGDSLYPVGYLDKQVPDTS 

VQETDRILVEKRCWDIALGPLKQIPMNLFIMY 

MAGNTISIFPTMMVCMMAWRPIQALMAISAT 

FKMLESSSQKFLQGLVYLIGNLMGLALAVYK 

CQSMGLLPTHASDWLAFIEPPERMEFSGGGL 
LL 


881 


2231 


A 


7615 


291 


1452 


SPQKTMRSHTITMTTTSVSS WP YSSHRMRFIT 

NHSDQPPQNFSATPNVTTCPMDEK1XSTVLTT 

SYSVIFIVGLVGNIIALYVFLGIHRKRNSIQIYL 

LNVA1ADLLL1FCLPFRJMYHINQNKWTLGVIL 

CKWGTLFYMNMYISIILLGF1SLDRYIKINRSI 

QQRKAITTKQSIYVCCIVWMLALGGFLTMIIL 

TLKKGGHNSTMCFHYRDKHNAKGEAIFNFIL 






SEQ ID
NO: of
nucleotide
sequence


SEQ ID
NO: of
peptide
sequence


Method


SEQ
ID NO:
in
USSN
09/496
914


Predicted
beginning
nucleotide
location
corresponding
to first
amino acid
residue of
peptide
sequence


Predicted end
nucleotide
location
corresponding
to last amino
acid residue
D=Aspartic Acid, E=Glutamic Acid,
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VVMFWl IFI T TIT SYIKTOKNI 1 RT^KRRSKFPN 

SGKYATTARNSFIVLIIFTICFVPYHAFRFIYISS 

OLNVSSCYWKEIVHKTNEIMLVLSSFNSCLDP 

VMYFLMSSNIRKJMCQLLFRRFQGEPSRSEST 

SEFKPGYSLHDTSVAVKIQSSSKST 




LLjL 


A 

r\ 


7^1 7 


O l 


37Q 


ROMALLKANKDLISAGLKEFSVLLNOOVFND 
PLVSEEDMVTVVEDWMNFYINYYRQQVTGE 
PQERDKALQELRQELNTLANPFLAKYRDFLK 
SHELPSHPPPSS 


883 


2233 


A 


7622 


400 


215 


KVKTCRYNPKYSAANDTGFVDIPSREKDLAK 
AVATVGPISVAVGASHVFFQFYKKGKHLSS 


884 


2234 


A 


7638 


2640 


2861 


APVLILQMVKLSIVLTPQFLSHDQGQLTKELQ 
QHVKSVTCPCEYLRKVSECRQMGPGALEQFP 
GLSCHTSHSG 


885 


2235 


A 

A 


/64z 


201 




DCDr,VX>fl71 P AK/fQT? VTQPVMPAVPPHT TWT I 
r oKOJSJVuiLJi/vlYloiv i i or V lNr/\ V r rnL 1 v v LrL 

AI GMFFTA WFF V YE VTSTKYTRDIYKELLI SL 
VASLFMGFGVLFLLLWVGIYV 


886 


2236 


A 


7692 


61 


569 


A t>T7MOTJQT>OIJ'I7VTQPXV\7I^"T CT VTHTWl fiTvrt-IA 
AriiNrroKl^ririNor. 1 JS. VNJLMjI\. 1 Ul WLOiNrl/v 

HLGEHFSTHHELGLSGKWGFLVKNILEVIRN 

GGMETRHPGKVS SWFHRWDSRAEQHNHAE 

HHEDVPQGDEDSKVSEAQQEFPDVVTCAGLP 

GLLPKALRVLLFQLKVQHRPGIHQQRPEQQD 

VSDHRYGRSVRQNRK 


887 


2237 


A 


7693 


85 


315 


NPGCCLPVAMRTSYLLLFTLCLLLSEMASGG 

NFLTGLGHRSDHYNCVSSGGQCLYSACPIFTK 

IQGTCYRGKAKCCK 


888 


2238


A 


7702 


242 


1298 


APSHRRR YLSPSRSAGQLGNMALERLC SVLK 
VLLITVLVVEGIAVAQKTQDGQNIGIKHIPAT 
QCGIWVRTSNGGHFASPNYPDSYPPNKECIYI 

TEA A t>T> /~\T> TCI TTTrMJIJ WltTPCnPfDTTPll-Jl 17 W D 

rL/fc-ri Y i Ini^rfcUl^lJnljliVK 
DGPFGFSPL1DRYCGVKSPPLIRSTGRFMWIKF 
SSDEELEGLGFRAKYSFIPDPDFTYLGGILNPIP 
DCQFELSGADGIVRSSQVEQEEKTKPGQAVD 
C1WTIKATPKAKIYLRFLDYQMEHSNECKRNF 
VAVYDGSSSIENLKAKFCSTVANDVMLKTGI 
GVIRMWADEGSRLNRFRMLFTSFGGASPAQA 
ALSFCHSNMCINNSLVCNGVQNCAYPWDEN 
HC 


889 


2239 


A 


7707 


185 


2911 


CHY1MNPSTHHPASAGGS1LGLFDFFGLGLGE 

MTMDALLARLK1XNPDDLREEIVKAGLKCGP 

ITSTTRBFEKKLAQALLEQGGRLSSFYHHEA 

GVTALSQDPQRBLKPAEGNPTDQAGFSEDRDF 

GYSVGLNPPEEEAVTSKTCSVPPSDTDTYRAG 

ATA SKEPPLYYG VCPVYEDVP ARNERI YV YE 

NKKEAL Q AVKM 1KGSRFKAF STRE D AEKF AR 

GICDYFPSPSKTSLPLSPVKTAPLFSNDRLKDG 

LCLSESETVNKERANSYKNPRTQDLTAKLRK 

AVEKGEEDTFSDLIWSNPRYUGSGDNPTTVQ 

EGCRYNVMHVAAKENQASICQLTLDVLENP 

DFMRLMYPDDDEAMLQKRIRYWDLYLNTP 

VKNSRNKYDKTPEDVICERSKNKSVELKERIR 

EYLKGHYYVPLLRAEETSSPVIGELWSPDQTA 

EASHVSRYGGSPRDPVXTLRAFAGPLSPAKAE 

DFRKLWKTPPREKAGFLHHVKKSDPERGFER 

VGRELAHELGYPWVEYWEFLGCFVDLSSQE 

GLQRLEEYLTQQFJGKKAQQETGEREASCRD 

KATTSGSNSISVRAFLDEDDMSLEE1KNRQNA 

ARNNSPPTVGAFGHTRCSAFPLEQEADLLEAA 
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EPGGPHSSRNGI PHPI NT-T^RTI AfUCRPRT APR 

GEEAHLPPVSDLTVEFDKLNLQNIGRSVSKTP 

DESTKTKDQILTSRINAVERDLLEPSPADQLG 

NGHRRTESEMSARJAKMSLSPSSPRHEDQLEV 

TREPARRLFLFGEEPSKLDQDVLAALECADV 

DPHQFPAVHRWKSAVLCYSPSDRQSWPSPAV 

KGRFKSQLPDLSGPHSYSPGRNSVAGSNPAKP 

GLGSPGRYSPVHGSQL RRMARLAELAAL 


890 


2240 


A 


7711 


360 


269 


RHMPVIPALWEAEVGGLLEPRSSRSAWATE 


891 


2241 


A 


7721 


61 


1175 


KLPWEPSFLIKMQIIRHSEQTLKTALISKNPVL 

VSQYEKLDAGEQRLMNEAFQPASDLFGPITL 

HSPSDWITSHPEAPQDFEQFFSDPYRKTPSPN 

JVKol i lyblUMAjJN 1 KliolMi YlJvWL 1 u YLKAYr 

YGLRVKLLEPVPVSVTRCSFRVNENTHNLQIH 

AGDILKFLKKKKPEDAFCWGITMIDLYPRDS 

WNFVFGQASLTDGVGIFSFARYGSDFYSMHY 

KGKVKKLKKTSSSDYS1FDNYYIPE1TSVLLLR 

SCKTLTHEIGHIFGLRHCQWLACLMQGSNHL 

EEADRRPLNLCPICLHKLQCAVGFSIVERYKA 

LVRWIDDESSDTPGATPEHSHEDNGNLPKPV 

EAFKEWKEWIIKCLAVLQK 


892 


2242 


A 


7723 


2 


1650 


SAPTAPARPCRAERGSGGGMLALLAASVALA 

VAAGAQDSPAPGSRFVCTALPPEAVHAGCPL 

PAMPMQGGAQSPEEELRAAVLQLRETVVQQ 

KETLASARAIRELTGKLARCEGLAGGKARGA 

GATGKDTMGDLPRDPGHWEQLSRSLQTLK 

DRLESLEPLPAMPMQGGAQSPEEELRAAVLQ 

LRETVVQQKETLASARAIRELTGKLARCEGL 

AGGKARGAGATGKDTMGDLPRDPGHVVEQ 

LSRSLQTLKDRLESLEHQLRANVSNAGLPGD 

r Kb V L.y \l KLu JaLbKQL-LKKO AbL hD bKS LLH 

NETSAHRQKTESTLNALLQRVTELERGNSAF 

KSPNAFKVSLPLRTNYLYGKUECKTLPELYAFT 

ICLWLRSSASPGMGTPFSYAVPGQANEIVLIE 

WGNNPIELLINDKVAQLPLFVSDGKWHHICV 

TWTTRDGMWEAFQDGKKLGTGENLAPWHPI 

KPGGVLILGQEQDTVGGRFDATQAFVGELSQ 

FNIWDRVLRAQEIVNIANCSTNMPGNIIPWD 

NNVDVFGGASKWPVETCEERLLDL 


893 


2243 


A 


7729 


3554 


2419 


LTAGTAMNYPLTLEMDLENLEDLFWELDRL 

DNYNDTSLVENHLCPATEGPLMASFKAVFVP 

VAYSLIFLLGVIGNVLVLVDLERHRQTRSSTET 

LCKTVIALHKVNFYCSSLLLACIAVDRYLAIV 
HAVHAYRHRRLLSIH1TCGTIWLVGFLLALPEI 
LFAKVSQGHHNNSLPRCTFSQENQAETHAWF 
TSRFLYHVAGFLLPMLVMGWCYVGVVHRLR 

OAfYRRPOROKAVRVATI VT^IFFT PWQDVmv 
v^^vv^rvrvjr v^jvv^rv/A vi\ v AVil_> v l jirr Lv^ W or I Jell V 

IFLDTLARLKAVDNTCKLNGSLPVAITMCEFL
CTGPASLCQLFPSWRRSSLSESENATSLTTF 


894 


2244 


A 


7738 


670 


287 


FVTRAGRWGAGARVRGGAGGMASGAARWL 
VLAPVRSGALRSGPSLRKDGDVSAAWSGSGR 
SLVPSRSVIVTRSGAILPKPVKMSFGLLRVFSI 
VIPFLYVGTLISKNFAALLEEHDIFVPEDDDDD
D 


895 


2245 


A 


7753 


119 


278 


APYAHSQVHCLDKVCGLLPFLNPEVPDQFYR 
LWLSLFLHAGKEAPHCPRTRPL 


896 


2246 


A 


7754 


1 


372 


SPAWWNSQQRWSPFLALLTLEPTFHHLLP1M
QVSTAALAVLLCTMALCNQVLSAPLAADTPT
ACCFSYTSRQ1PQNF1ADYFETSSQCSKPSVIFL 
TKRGRQVCADPSEEWVQKYVSDLELSA 


897 


2247 


A 



7761 


1725 


445 


RPRRRGTHHFSCVLGSFRVSAMFPRVSTFLPL 
RPLSRHPLSSGSPETSAAAIMLLTVRHGTVRY 
RSSALLARTKNNIQRYFGTNSV1CSKKDKQSV 
RTEETSKETSESQDSEKJEKTKKDLLGHKGMK 
VELSTVNVRTTKPPKRRPLKSLEATLGRLRRA 

TEY ArK KKJ fcr Lb rJiL V AAAo A V s\Do L,rrUiS\£ 

TTKSELLSQLQQHEEESRAQRDAKRPKISFSNI 
ISDMKVARSATARVRSRPELRJQFDEGYDNYP
GQEKTDDLKJCRKNIFTGKRLNIFDMMAVTTCE
APETDTSPSLWDVEFAKQLATVNEQPLQNGF 
EiiLiQ W 1 KtAjkL. W c,r r I in iNtiALir uuuuai,rn 
EHIFLEKHLESFPKQGP1RHFMELVTCGLSKNP
YLSVKQKVEHIEWFRNYFNEKKDILKESNIQF 
KLRPWKFLFRNN 


898 


2248 


A 


7775


85 


496 


SCQTTQPPAQSCSTGTMRIMLLFTAILAFSLA 
QSFGAVCKEPQEEWPGGGRSKRDPDLYQLL 

rtriT ri/cncci T7/— >t i 17 ai C/~\ A PTT^DV DPTQ DCV 

QRLFKSHSSLhCjLLPvALbQAb I DrMsiso I SrfcJs. 
RDMHDFFVGLMGKRSVQPDSPTDVNQENVP 
SFGILKYPPRAE 


899 


2249 


A 


7785 


179 


703 


PFHLGASSNTFRLQVQTQESKAQKEVKMGFI 

FSKSMNESMKNQKEFMLMNARLQLERQLIM 

QSEMRERQMAMQIAWSREFLKYFGTFFGLA 

AISLTAGAIKKKKPAFLVPIVPLSF1LTYQYDL 

GYGTLLERMKGEAEDILETEKSKLQLPRGMIT 

FESIEKARKEQSRFFIDK 


900 


2250 


A 


7789 


1465 


300 


VWLPLKSYKIRSPSLHCQCE1FREEFLFSSLQE 

GRDKDTFSKMAMVSEFLKQAWFIENEEQEY 

VQTVKSSKGGPGSAVSPYFTFNPSSDVAALH 

KAIMVKGVDEAT1IDILTKRNNAQRQQIKAAY 

LQETGKPLT)ETLKKAL 1 OHLbh V VLAL.LK 1 r 

AQFDADELRAAMKGLGTDEDTLIEILASRTN 

KEIRD1NRVYREELICRDLAJCDITSDTSGDFRN 

ALLSLAKGDRSEDFGVNEDLADSDARALYEA 

GERRKGTDVNVFNTILTTRSYPQLRRVFQKY 

TKYSKHDMNKVLDLELKGDIEKCLTAIVKCA 

TSKPAFFAEKLHQAMKGVGTRHKALIRIMVS 

RSEIDMNDIKAFV'QKMYGISLCQAILDETKGD 

YEKILVALCGGN 


901 


2251 


A 


7796 


2 


807 


VEFHrQRAKAOAIsAJ^MU VJL.L1 L/L-oJjVL. 

ALLFPSMASMAA1GSCSKEYRVLLGQLQKQT 

DLMQDTSRLLDPY1R1QGLDVPKLREHCRERP 

G AFP S EETLRG I .GRR CFLQTLN AT LGC VLHRL 

ADLEQRLPKAQDLERSGLNIEDLEKLQMARP 

NILGLRNNIYCMAQLLDNSDTAEPTKAGRGA 

SQPPTPTPASDAFQRKLEGCRFLHGYHRFMH

SVGRVFSKWGESPNRSRRHSPHQALRKGVRR 

TRPSRKGKRLMTRGQLPR 


902 


2252 


A 


7802


n 
Z 


10 1 

1 £. 1 


TAARRROKGTAARRLQKGTAARRRQKGTAA

RRRQKGTAARRPQKGTAARRRQKGTAARRR 

QKGTAARRRQKGTAARRPQKGTAARRRQKG 

TAARRRQKGTAARRRQKGLAIASRGCPCASR 

AGGVRGAGSRLRAMAPKVFRQYWDIPDGTD 

CHRKAYSTTS1ASVAGLTAAAYRVTLNPPGTF 

LEGVAKVGQYTFTAAAVGAVFGLTTCISAHV 

REKPDDPLNYFLGGCAGGLTLGARTHNYGIG 

AAACVYFGIAASLVKMGRLEGWEVFAKPKV


903


2253 


A 


7807 


1 


584 


PWLPWSDGRAARSSRKCPRSRFPVQVGKMA 

VSTVFSTSSLMLALSRHSLLSPLLSVTSFRKFY 

RGDSPTDSQKDMIElPLPPWQERTDESIEnCR 

ARLLYESRKRGMLENCILLSLFAKEHLQHMT 

EKQLNLYDRLINEPSNDWDIYYWATEAKPAP

EIFENEVMALLRDFAKNKNKEQRLRAPDLEY 

LFEKPR 


904 


2254 


A 


7813 


40 


821 


GAGRALGHLETGAGDVAAALPARKFPRSLLG 

AGARLTGWTMNVFRILGDLSHLLAMILLLGK 

IWRSKCCKGISGKSQILFALVFTTRYLDLFTNF 

1S1YNTVMKVVFLLCAYVTVYMIYGKFRKTF 

DSENDTFRLEFLLVPVIGLSFLENYSFTLLE1L 

WTFSIYLESVAILPQLFMISKTGEAETITTHYL 

FFLGLYRALYLANWIRRYQTENFYDQIAVVS 

GWQ1TFYCDFFYLYVTKGRSWDDSNADTGL

RSYSSI 


905 


2255 


A 


7817 


1399 


881 


LSNKDVLSPQLKDENSKLRRKLNEVQSFSEA 

QTEMVRTLERKLEAKMIKEESDYHDLESVVQ 

QVEQNLELMTKRAVKAENHVVKLKQEISLL 

QAQVSNFQRENEALRCGQGASLTVVKQNAD 

VALQNLRVVMNSAQASIEQLVSGAETLNLVA

EILKSIDRISEVKDEEEDS 


906 


2256 


A 


7822 


3 


1462 


DSPRNRFEILGRPTRTPTRPGPRPAMEDLDAL 

LSDLETTTSHMPRSGAPKERPAEPLTPPPSYG 

HQPQTGSGESSGASGDKDHLYSTVCKPRSPK 

PAAPAAPPFSSSSGVLGTGLCELDRLEQELNA 

TQFNITDEIMSQFPSSKVASGEQKEDQSEDKK 

RPSLPSSPSPGLPKASATSATLELDRLMASLSD 

FRVQNHLPASGPTQPPVVSSTNEGSPSPPEPTG 

KGSLDTMLGLLQSDLSRRGVPTQAKGLCGSC 

NKPIAGQVVTAEGRAWHPEHFVCGGCSTAL

GGSSFFEKDGAPFCPECYFERFSPRCGFCNQPI 

RHKMVTALGTHWHPEHFCCVSCGEPFGDEG 

FHEREGRPYCRRDFLQLFAPRCQGCQGPILDN 

YISALSALWHPDCFVCRECFAPFSGGSFFEHE 

GRPLCENHFHARRGSLCATCGLPVTGRCVSA 

LGRRFHPDHFTCTFCLRPLTKGSFQERAGKPY 

CQPCFLKLFG 


907 


2257 


A 


7828 


1792 


1671 


FIYVNQSFAPSPDQEVGTLYECFGSDGKLVLH 
YOCSQAWG 


908 


2258 


A 


7842 


110 



1172 


KLSCPCSHGTRVTAVRGPRLKAGVQWHDLG 

SLQPPPSGLKQSSHLSLSSSWDFRHAPTHPET 

YTCPKMIEMEQAEAQLAELDLLASMFPGENE 

LIV>JDQLAVAELKDCIEKKTMEGRSSICVYFTI 

NMNLDVSDEKMAMFSLACILPFKYPAVLPEI 

TVRSVLLSRSQQTQLNTDLTAFLQKHCHGDV 

CILNATEWVREHASGYVSRDTSSSPTTGSTVQ 

SVDLIFTRLWIYSHHIYNKCKRKNILEWAKEL 

SLSGFSMPGKPGWCVEGPQSACEEFWARLR 

KLNWKiuLIRiiRjEDIPFDGTNDETERQRJCFS 

EEKVFSVNGARGNHMDFGQLYQFLNTKGCG 

DVFQMFLWV 


909 


2259 


A 


7870 


3067 


2923 


EGICVYTFIYVHMYTRTCMHTYPYMYMNSV 
LISSEILLIPSKYLFESK


910 


2260 


A 


7884 


212 


4874 


GALTWSHPLLAVCPQGVWLGSTPSGSPAELP 
PSHRVNAEPGCVVTNACASGPCPPHANCRDL 
WQTFSCTCQPGYYGPGCVDACLLNPCQNQG 
SCRHLPGAPHGYTCDCVGGYFGHHCEHRMD 
QQCPRGWWGSPTCGPCNCDVHKGFDPNCNK
TNGQCHCKEFHYRPRGSDSCLPCDCYPVGST

SRSCAPHSGQCPCRPGALGRQCNSCDSPFAEV 

TASGCRVLYDACPKSLRSGVWWPQTKPGVL 

ATVPCPRGAJLGLRGAGAAVRLCDEAQGWLE

PDLFNCTSPAFRELSLLLDGLELNKTALDTME 

AKKLAQRLREVTGHTDHYFSQDVRVTARLL 

AHLLAFESHQQGFGLTATQDAHFNENLLWA 

GSALLAPETGDLWAALGQRAPGGSPGSAGLV 

RHLEEYAATLARNMELTYLNPMGLVTPNIML 

SIDRMEHPSSPRGARRYPRYHSNLFRGQDAW 

DPHTHVLLPSQSPRPSPSEVLPTSSSIENSTTSS 

WPPPAPPEPEPGISIIILLVYRTLGGLLPAQFQ

AERRGARLPQNPVMNSPVVSVAVFHGRNFLR 

GILESPISLEFRJLLQTANRSKAICVQWDPPGLA 

EQHGVWTARDCELVHRNGSHARCRCSRTGT

FGVLMDASPRERLEGDLELLAVFTHVVVAVS 

VAALVLTAAILLSLRSLKSNVRGIHANVAAA 

LG V AELLFLLGIHRTHNQI ,VCTA VVILLHYFF 

LSTFAWLFVQGLHLYRMQVEPRNVDRGAMR 

FYHALGWGVPAVLLGLAVGLDPEGYGNPDF 

CWISVHEPLIWSFAGPVVLVIVMNGTMFLLA 

ARTSCSTGQREAKKTSALTLRSSFLLLLLVSA 

SWLFGLLAVNHSILAFHYLHAGLCGLQGLAV 

LLLFCVLNADARAAWMPACLGRKAAPEEAR 

PAPGLGPGAYNNTALFEESGLIRITLGASTVSS 

VSSARSGRTQDQDSQRGRSYLRDNVLVRHGS 

AADHTDHSLQAHAGPTDLDVAMFHRDAGA 

DSDSDSDLSLEEERSLSIPSSESEDNGRTRGRF 

QRPLCRAAQSERLLTHPKDVDGNDLLSYWPA

LGECEAAPCALQTWGSERRLGLDTSKDAAN 

NNQPDPALTSGDETSLGRAQRQRKGILKNRL 

QYPLVPQTRGAPELSWCRAATLGHRAVPAAS 

YGRIYAGGGTGSLSQPASRYSSREQLDLLLRR 

QLSRERLEEAPAPVLRPLSRPGSQECMDAAPG 

RLEPKDRGSTLPRRQPPRDYPGAMAGRFGSR 

DALDLGAPREWLSTLPPPRRTRDLDPQPPPLP

LSPQRQLSRDPLLPSRPLDbLbKo b JN bK-fc vLlAJ 

VPSRHPSREALGPLPQLLRAREDSVSGPSHGP 

STEQLDILSSILASFNSSALSSVQSSSTPLGPHT 

TATPSATASVLGPSTPRSATSHSISELSPDSEPR 

DTQALLSATQAMDLRRRDYHMERPLLNQEH 

LEELGRWGSAPRTHQWRTWLQCSRARAYAL 

LLQHLPVLVWLPRYPVRDWLLGDLLSGLSVA 

IMQLPQGLAYALLAGLPPVFGLYSSFYPVFIY 

FLFGTSRHISVESLCVPGPVDT


911 


2261 


A 


7890 


21 


806 


EFGTSRSoKSMAhDLtjjLorvjb lAo VcMLrc-rlO 

SCRPKARSSSARWALTCCLVLLPFLAGLTTYL 

LVSQLRAQGEACVQFQALKGQEFAPSHQQV 

YAPLRADGDKPRAHLTVVRQTPTQHFKNQFP 

ALHWEHELGLAFTKNRMNYTNKFLLIPESGD 

YFIYSQVTFRGMTSECSEIRQAGRPNKPDSITV 

VTTKVTDSYPFPTOLLMGTKSVCEVGSNWFQ 

PIYLGAMFSLQEGDKLMVNVSDISLVDYTKE 

DKTFFGAFLL 


912 


2262 


A 


7891 


1263 


111 


ACGIRHEGALPGLTATPEAMLRFLPDLAFSFL 

LILALGQAVQFQEYVFLQFLGLDKAPSPQKFQ 

PVPYILKKJFQDREAAATTGVSRDLCYVKELG 

VRGNVLRFLPDQGFFLYPKKISQASSCLQKLL 

YFNLSAIKEREQLTLAQLGLDLGPNSYYNLGP 

ELELALFLVQEPHVWGQTTPKPGKMFVLRSV
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seq- 
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seq- 
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SEQ 
ID NO; 
in 
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nucleotide 
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ng to first 

amino acid 

residue of 

peptide 
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Predicted end 
nucleotide 
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to last amino 
acid residue 
of peptide 
sequence 


Amino acid sequence (A=Alanine, C=Cysteine, 
D=Aspartic Acid, E=Glutamic Acid, 
F=Phenylalanine, G=Glycine, H=Histidine, 
I=Isoleucine, K=Lysine, L=Leucine, 
M=Methionine, N=Asparagine, P=Proline, 
Q=Glutamine, R=Arginine, S=Serine, 
T=Threonine, V=Valine, W=Tryptophan, 
Y=Tyrosine, X=Unknown, *=Stop codon, 
/=possible nucleotide deletion, \=possible 
nucleotide insertion) 














PWPQGAVHFNLLDVAKDWNDNPRKKFGLFL 

EILVKEDRDSGVNFQPEDTCARLRCSLHASLL 

VVTLNPDQCHPSRKRRAAIPVPKLSCKNLCH 

RHQLFINFRDLGWHKWIIAPKGFMANYCHGE 

CPFSLTISLNSSNYAFMQALMHAVDPEIPQAV 

CIPTKLSPISMLYQDNNDNVILRHYEDMVVD 

CPPPP 


913 


2263 


A 


7892 


15 


849 


ASRLPRGPGCGADMRPLLGLLLVFAGCTFAL 
YLJLb 1 KJLrROKKJLuS 1 bbAuOKSLWr PbDLAb 
LRELSEVLREYRKEHQAYVFLLFCGAYLYKQ 
GFAIPGSSFLNVLAGALFGPWLGULLCCVLTS 
VGATCCYLLSSIFGKQLWSYFPDKVALLQR
KVEENRNSLFFFLJLFLRLFPMTPNWFLNLSAPI 
LNIPIVQFFFSVL1GLIPYNF1CVQTGSILSTLTS 
LDALFSWDTVFKLLAIAMVALIPGTUKKFSQ 
KilLQLNEl o 1 ANHIH bKKD 1 


914 


2264 


A 


7893 


815 


959 


KSGWVWWLTPLIPALWEAQTEGSLRPEVKN 
RLSN1TRPFFSKKKKILV 


915 


2265 


A 


7909 


3 


641 


HASGPGGLLRRRRGSGANMPVARSWVCRKT 

YVTPRRPFEKSRLDQELKLIGEYGLRNKREV 

WRVKFTLAK1RKAARELLTLDEKDPRRLFEG 

NALLRRLVRIGVLDEGKMKLDYILGLK1EDFL 

ERRLQTQVFKLGLAKSIHHAHVLIQQCHIRVR 

EQWNILFFTVRLDSQKHIDFSLCFPIGVANPS 

HVKRKNASKGQGGAGARDDEEEE 


916 


2266 


A 


7914 


3 


967 


VAHTQWHTCQRLSQLTHRSILKYLLIDTHAC 

QVLILKHTHASLSLPSCQECFPSSIPSASHMVS 

HPHPPPSPRWGQTPEGLPAASPCGPGPRSCFS 

SILPTGDSWGMLACLCTVLWHLPAVPALNRT 

GDPGPGPSIQKTYDLTRYLEHQLRSLAGTYLN 

YLGPPFNEPDFNPPRLGAETLPRATVDLEVW 

RSLNDKLRLTQNYEAYSHLLCYLRGLNRQAA 

TAELRRSLAHFCTSLQGLLGSIAGVMAALGY 

PLPQPLPGTEPTWTPGPAHSDFLQKMDDFWL 

LKELQTWLWRSAKDFNRUCKKMQPPAAAVT 

LHLGAHGF 


917 


2267 


A 


7921 


2 


1166 


RPRRGQGLVQEVQTENVTVAEGGVAEITCRL 

HQYDGSIVVIQNPARQTLFFNGTRALKDERFQ 

LEEFSPRRVRIRLSDARLEDEGGYFCQLYTED 

THHQIATLTVLVAPENPVVEVREQAVEGGEV 

ELSCLVPRSRPAATLRWYRDRKELKGVSSSQ 

ENGKVWSVASTVRFRVDRKDDGGU1CEAQN

QALPSGHSKQTQYVLDVQYSPTARIHASQAV 

VREGDTLVLTCAVTGNPRPNQIRWNRGNESL 

PERAEAVGETLTLPGLVSADNGTYTCEASNK 

HGHARALYVLVVYGESRLRPTEGGGGAPDP 

GAVVEAQTSVPYAIVGGILALLVFLIICVLVG 

MVWCSVRQKGSYLTHEASGLDEQGEAREAF 

LNGSDGHKRKEEFFI 




918


2268


A 

A 


7938 


J 


io33 


DDDI nD A PDDCOCUOPCl CDC A ITWKjf A C>T> U/CTV 

KKKLFr Abrrbbb V bbbLbrbA V VMALR WS 1 K. 

ESPRWRSALLLLFLAGVYGNGALAEHSENVH 

ISGVSTACGETPEQIRAPSGI1TSPGWPSEYPAK 

IN CS WF] RANP GEI1TI S FQDFD1QG SRRCNLD 

WLTIETYKN1ESYRACGSTTPPPYISSQDHIWIR 

FHSDDN1SRKGFRLAYFSGKSEEPNCACDQFR 

CGNGKC1PEAWKCNNMDECGDRSDEEICAKE 

ANPPTAAAFQPCAYNQFQCLSRFTKVYTCLP 

ESLKCDGNIDCLDLGDEIDCDVPTCGQWLKY 

FYGTFNSPNYPDFYPPGSNCTWLIDTGDHRK

VILRFTDFKLDGTGYGDYVKJYDGLEENPHK 

LLRVLTAFDSHAPLTVVSSSGQIRVHFCADKV 

NAARGFNATYQVDGFCLPWEIPCGGNWGCY 

TEQQRCDGYWHCPNGRDETNCTMCQKEEFP 

CSRNGVCYPRSDROIYQNHCPNGSDEKNCFF 

CQPGNFHCKNNRCVFESWVCDSQDDCGDGS 

DEENCPVIVPTRVITAAV1GSLICGLLLVIALG

CTCKLYSLRMFERRSFETQLSRVEAELLRREA 

PPSYGQLIAQGLIPPVEDFPVCSPNQASVLENL 

DI AYTDCflT /"3FTQVT>T V>\A AP.l? QQWIAX/MPTFMF A 

.KLA YKol^LUr 1 o VKJUriVlALiKoolNi WiNKJr INr A. 
RSRHSGSLALVSADGDEVVPSQSTSREPERNH 

1 rLKoL.ro V CoUU i JJ 1 iliNE/KKlJlVlj c \0/\oO \J V /\/\ 

PLPQKVPPTTAVEATVGACASSSTQSTRGGH 
ADNGRDVTSVEPPSVSPARHQLTSALSRMTQ 
GLRWVRFTLGRSSSLSQNQSPLRQLDNGVSG
REDDDDVEMLIPISDGSSDFDVNDCSRPLLDL 
ASDQGQGLRQPYNATNPGVRPSNRDGPCERC 


919 


2269 


A 


7951 


1674 


1839 


VVRVTCCPPARSTTERTNAYDEEDCVEMVAS 
GGWNDVACHTTMYFMCEFDKKNM


920 



2270 


A 


7953 



47 


572 


OuKAb Wrbl^AJMirKKliLjlil UJS-vv * t-UVLAA 
GLRCLPHLPAICARRMSPAFRAMDVEPRAKG 

VLLEPFVHQVGGHSCVLRFNETTLCKPLVPRE 
uneven pa ux/tp vpTPnwr,k r Qni t vni pwvx/ 

RGDVRDRGHGRPWQPSLEPSLPPTLCFPSLSS 
FSSSWPSAQHLTPSVFNPW 


921 


2271 


A 


7957 


612 


812 


RSGRTWTGIGYSKALQSSNRNTKSLLQNEF

MMVYSFRALSFKESTWATFQHGGEATKSRSL 

SSTQ 


922 


2272 


A 


7967 


1443 


1660 


ENITEKWKEIWMCRGNKKSCCWTF1KDRHLT 
VSCCKSKSGETLLICIFCSNLVGFFFFGIRGFSN 
WELVKPN 


923 


2273 


A 


7981 


1 


3023 


GSAPRAATAMARARPPPPPSPPPGLLPLLPPLL

LLPLLLLPAGCRALEETLMDTKWVTSELAWT 

SHPESGWEEVSGYDEAMNPIRTYQVCNVRES 

SQNNWLRTGFIWRRDVQRVYVELKFTVRDC 

NSIPNIPGSCKETFNLFYYEADSDVASASSPFW 

MENPYVKVDTIAPDESFSRLDAGRVNTKVRS 

FGPLSKAGFYLAFQDQGACMSLISVRAFYKK 

CASTTAGFALFPETLTGAEPTSLV1APGTCIPN 

AVEVSVPLKLYCNGDGEWMVPVGACTCATG 

HEPAAKESQCRPCPPGSYKAKQGEGPCLPCPP 

NSRTTSPAASICTCHNNFYRADSDSADSACTT 

VPSPPRGVISNVNETSLILEWSEPRDLGVRDD

LLYNVICKKCHGAGGASACSRCDDNVEFVPR 

QLGLSEPRVHTSHLLAHTRYTFEVQAVNGVS 

GKSPLPPRYAAVNITTNQAAPSEVPTLRLHSS 

SGSSLTLSWAPPERPNGVILDYEMK.YFEKSEG 

IASTVTSQMNSVQLDGLRPDARYWQVRART 

VAGYGQYSRPAEFETTSERGSGAQQLQEQLP 

1 TVfi^ATAHT VFVVAVWTAIVPT TtTCORHGS 
v vj o/ \ i v r v v /\ v v v i/vi v v^i^i\~ix\^ivrivjo 

DSEYTEKLQQYIAPGMKVY1DPFTYEDPNEA 

VREFAKE1DVSCVKJEEVIGAGEFGEVCRGRL 

KQPGRREVFVAIKTLKVGYTERQRRDFLSEA 

SIMGQFDHPNIIRLEGVVTKSRPVMLLTEFME 

NCALDSFLRLNDGQFTVIQLVGMLRGIAAGM 

KYLSEMNYVHRDLAARNILVNSNLVCKVSDF 

GLSRFLEDDPSDPTYTSSLGGKJPIRWTAPEAI 

AYRKFTSASDVWSYGIVMWEVMSYGERPY

WDMSNQDVINAVEQDYRLPPPMDCPTALHQ 
LMLDCWVRDRNLRPKPSQIVNTLDKLIRNAA 
SLKVIASAQSGMSQPLLDRTVPDYTTFTTVGD 
W LUAlKJVLuK Y fvbbr V b AGr Abr DL V AQMTA 
EDLLR1GVTLAGHQKKILSSIQDMRLQMNQT 

LrV^V 


924 


2274 


A 


7985 


1 


503 


FRPRTKKATAMYLEHYLDSIENLPCELQRNF 

QLMRELDQRTEDKKAEIDILAAEYISWKTLS 

PDQRVERLQKIQNAYSKCKEYSDDKVQLAM 

fYWTZ'KAXfTWTJTO'Dl T\ A TM A r> T7T? A FAI i/rwx <Pn 

yJ i rJVL V UKri IRRLDADLARr hADLKDKMEG 
SDFESSGGRGLKKGRGQKEKRGSRGRGRRTS 
EEDTPKKKKHKGG 


925 


2275 


A 


7994 


447 


589 


LPCSFCAQCMSSFERVWLQQSHFHNPRWNSR 
oFIKC Y CQHWPHCVHC 


926 


2276 


A 


7996 


925 


582 


GPCKVCCITLAIMLQCHSFYRKDVQVEHPKS
LNPKYSQIENFLSADMALKRKCLLSISDLDFW 
IWDAQPVGIMQTLQNLKKIPNPGCFWSQAFQI 
RDTQPILPLGGRYYITIRQ


927


2277


A 


vyys 


2 


353 


RIQRPLNSRSPNHSLFVKAELTAKQATMfCLSV 
CLLLVTLALCCYQANAEFCPALVSELLDFFFI 
SEPLFKLSLAKFDAPPEAVAAKLGVKRCTDQ
MSLQKRSLIAEVLVKILKKCSV 


928 


2278 


A 


8004 


130 


588 


LAPLRCQPGTRTQPRSHPAANDPSAAMSAAG 
ARGLRATYHRLLDKVELMLPEKLRPLYNHPA 
GPRTVFFWAPIMKWGLVCAGLADMARPAEK 
LSTAQSAVLMATGFIWSRYSLVIIPKNWSLFA 
VNFFVGAAGASQLFRJWRYNQELKAKAHK 


929 


2279 


A 


8007 


2 


1016 


EFARRRVF1AAREMSLLRSLRVFLVARTGSYP

AGSLLRQSPQPRHTFYAGPRLSASASSKELLM 

KLRRKTGYSFVNCKKALETCGGDLKQAEIWL 

HKEAQKEGWSKAAKLQGRKTKEGLIGLLQE 

GNTTVLVEVNCETDFVSRNLKFQLLVQQVAL 

GTMMHCQTLKDQPSAYSKGFLNSSELSGLPA 

GPDREGSLKDQLALAIGKLGENMILKRAAWV 

KVPSGFYVGSYVHGAMQSPSLHKXVLGKYG 

ALVICETSEQKTNLEDVGRRLGQHWGMAPL 

SVGSLDDEPGGEAKTKMLSQPYLLDPSITLGQ

YVQPQGVSWDFVRFECGEGEEAAETE 


930 


2280 


A 


8008 


3 


1679 


NSRVWGPWTEPSAGSLRPMARKQNRNSFCEL 

GLVPLTDDTSHAGPPGPGRALLECDHLRSGV 

PGGRRRKDWSCSLLVASLAGAFGSSFLYGYN 

LSVVNAPTPYIKAFYNESWERRHGRPIDPDTL 

TLLWSVTVSIFAIGGLVGTXIVKMIGKVLGRK 

HTLLANNGFAISAALLMACSLQAGAFEMLIV 

GRFIMGIDGGVALSVLPMYLSEISPKEIRGSLG

QVTAIFICIGVFTGQLLGLPELLGKESTWPYLF 

GVIWPAWQLLSLPFLPDSPRYLLLEKHNEA 

RAVKAFQTFLGKADVSQEVEEVLAESRVQRS 

IRLVSVLELLRAPYVRWQVVTVIYTMACYQL 

CGLNAIWFYTNSIFGKAGIPPAKJPYVTLSTGG 

GTLTITLTLQDHAPWVPYLSIVGILAIIASFCSG 
PGGIPFILTGEFFQQSQRPAAFI1AGTVNWLSN 
FAVGLLFPF1QKSLDTYCFLVFATICITGAIYL
YFVLPETKNRTYAEISQAFSKRNKAYPPEEKI
DSAVTDGKINGRP 


931 


2281 


A 


8009 


861 


300 


AAGAWSAMPKAKGKTRRQKFGYSVNRKRL 
NRNARRKAAPR2ECSHIRHAWDHAKSVRQNL 
AEMGLAVDPNRAVPLRKRKVKAMEVDIEER
PKELVRKPYVLNDLEAEASLPEKJECGNTLSRD
LIDYVRYMVENHGEDYKAMARDEKNYYQD 
TPKQIRSKINVYKRFYPAEWQDFLDSLQKRK
MEVE 


932 


2282 


A 


8011 


412 


1 


SNLCLGNSWRWRWAKSRHHCIPTVTLSKRSG 
DIRGSHFSSPQRQRSQRVPGKETARVLRAGK 
QGRGQIPIPCPWPPPPPPPPPGSPGPGCRQFHQ 
SLEAKARHPASVREMRGKVKMRRALRRAPA 
STRASSRQPNPK


933 


2283 


A 


8012 


147 


1077 


PPVPPASRSDMAQNLKDLAGRLPAGPRGMGT 

ALKLLLGAGAVAYGVRESVFI'VEGGHRAIFF 

NRIGGVQQDTILAEGJLHFR1PWFQYPIIYDIRA 

RPRKISSPTGSKDLQMVNISLRVLSRPNAQEL 

PSMYQRLGLDYEERVLPSIVNEVLKSWAKF 

NASQLITQRAQVSLLIRRELTERAKDFSLILDD 

VAITELSFSREYTAAVEAKQVAQQEAQRAQF 

LVEKAKQEQRQKJVQAEGEAEAAKMLGEAL 

SKNPGY1KLRKIRAAQNISKTIATSQNRIYLTA 

DNLVLNLQDESFTRGSDSLIKGKK 


934 


2284 


A 


8023 


255


982 


SQFSLSQVLVDSAEEGSLAAAAELAAQKREQ 
RLRKFRELHLMRNEARKLKHQEVVEEDKRL 
KLPAN WEAKfCARLE WELKEEEKKKEC AARG 
ED YEK VKJLLEI S AEDAER WERKiCKRKHPDLG 
FSDYAAAQLRQYHRLTKQIKPDMETYERLRE 
KHGEEFFPTSNSLLHGTHVPSTEEIDRMVIDLE 
KQIEKRDKYSRRRPYNDDADIDYINERNAKF
NKKAERFYGKYTAEIKQNLERGTAV 


935 


2285 


A 


8027 


59 


310 


LVSSTVNLLTEKAPWNSLAWTVTSYVFLKFL 
QGGGTGSTGMRDSALTLLGIGPSHRHSLSIRL 
SQHSSPAPMYSQTFHILVLG 


936 


2286 


A 


8032 


1 


639 


SGRECNMAKTYDYLFKLLUGDSGVGKTCVL 

FRFSEDAFNSTFISTIGIDFKIRTIEI -DGKRJKLQ 

IWDTAGQERFRTITTAYYRGAMGIMLVYDIT 

NEKSFDNIRNWIRNIEEHASADVEKMILGNKC 

DVNDKRQVSKERGEKLALDYGIKFMETSAK 

ANINVENAFFTLARDIKAKMDKKLEGNSPQG

SNQGVKITPDQQKRSSFFRCVLL 


937 


2287 


A 


8039 


393 


311 


EETIHSENSYILEKYIPISANLTLTIA


938 


2288 


A 


8052 


675 


1334 


LHPAATSTAWLHVPPGLSMALSWVLTVLSLL

PLLEAQIPLCANLVPVPITNATLDRITGKWFYI 

ASAFRNEEYNKSVQEIQATFFYFTPNKTEDT1F 

LREYQTRQDQCIYNTTYLNVQRENGTISRYV 

GGQEHFAHLLILRDTKTYMLAFDVNDEKNW 

GLSVYADKPETTKEQLGEFYEALDCLRIPKSD 

VVYTDWKKDKCEPLEKQHEKERKQEEGES


939 


2289 


A 


8055 


12 


1039 


SSVAEFPERVQLSQPQNWNFSGAGGAWSLDF 

AEQLKWSAELARLGESIMDGKQGGMDGSKP 

AGPRDFPGIRLLSNPLMGDAVSDWSPMHEAA 

IHGHQLSLRNLISQGWAVNIITADHVSPLHEA 

CLGGHLSCVKILLKHGAQVNGVTADWHTPL 

FNACVSGSWDCVNLLLQHGASVQPESDLASP 

IHEAARRGHVECVNSLIAYGGNIDHKISHLGT 

PLYLACENQQRACVKKLLESGADVNQGKGQ 

DSPLHAVARTASEELACLLMDFGADTQAKN 

AEGKRPVELVPPESPLAQLFLEREGPPSLMQL

CRLR1RKCFGIQQHHK1TKLVLPEDLKQFLLH 

L 


940 


2290 


A 


8058 


2 


1203 


KVLSIREPAHSTARKASEPSQPSQPSQPGGHLI 
ARLRTMDLHLFDYSEPGNFSDISWPCNSSDCl

VVDTVMCPNMPNKSVLLYTLSFIYIFIFVIGMI 

ANSWVWVNIQAKTTGYDTHCYILNLAIADL 

WWLTIPVWVVSLVQHNQWPMGELTCKVTH 

LIFSINLFGSIFFLTCMSVDRYLSITYFTNTPSS 

RKKMVRRWCILVWLLAFCVSLPDTYYLKT

VTSASNNETYCRSFYPEHSIKEWLIGMELVSV 

VLGFAVPFSIIAVFYFLLARAISASSDQEKHSS 

RKIIFSYVVVFLVCWLPYHVAVLLDIFSE.HYI 

PFTCRLEHALFTALHVTQCLSLVHCCVNPVL 

YSFINRNYRYELMKAFIFKYSAKTGLTKLIDA 

SRVSETEYSALEQSTK 


941 


2291 


A 


8059 


73 


432 


DMAGLMTIVTSLLFLGVCAHHIIPTGSVVLPS 
PCCMFFVSKRIPENRWSYQLSSRSTCLKAGV 
IFTTKKGQQFCGDPKQEWVQRYMKNLDAKQ 
KKASPRARAVAVKGPVQRYPGNQTTC 


942 


2292 


A 


8067 


278 


1262 


GGIGEIKQRPSCLGRCLDPSLSVLMNISLGLGS 

VFSAVISQKPSRDICQRGTSLTIQCQVDSQVT 

MMFWYRQQPGQSLTLIATANQGSEATYESGF 

VIDKFPISRPNLTFSTLTVSNMSPEDSSIYLCSA 

GRQGTYEQYFGPGTRLTVTEDLKNVFPPEVA 

VFEPSEAEISHTQKATLVCLATGFYPDHVELS

WWVNGKEVHSGVSTDPQPLKEQPALNDSRY 

CLSSRLRVSATFWQNPRNHFRCQVQFYGLSE 

NDEWTQDRAKPVTQIVSAEAWGRADCGFTS 

ESYQQGVLSAT1LYEILLGKATLYAVLVSALV 

LMAMVKRKDSRG 


943 


2293 


A 


8070 


1 


879 


MVKVVPATRGNLPRSQLTGTHQHCQPREPKI

TASERLRRRPRATARLRAHAAPPEPPLAVFAP 

PSDRKELLALPVACDPVIASVMSWVQAASLI 

QGPGDKGDVFDEEADESLLAQREWQSNMQR 

RVKEGYRDGIDAGKAVTLQQGFNQGYKKGA 

EVILNYGRLRGTLSALLSWCHLHNNNSTLINK 

INN1XDAVGQCEEYVLKHLKSITPPSHWDLL 

DSIEDMDLCHVVPAEKKIDEAKDERLCENNA

EFHKNCSKSHSGIDCSYVECCRTQEHAHSGK 

PKPHMDFGTDSQF 


944 


2294 


A 


8073 


1 


797 


ESARWSRQLRRTLIRLSFPISCGRSHAFGGCK 

MAATSGTDEPVSGELVSVAHALSLPAESYGN 

DPDIEMAWAMRAMQHAEVYYKLISSVDPQF

LKLTKVDDQIYSEFRKNFb"l'LRIDVLDPEELK 

SESAKEKWRPFCLKFNGIVEDFNYGTLLRLD 

CSQGYTEENTIFAPR1QFFAIEIARNREGYNKA 

VYISVQDKEGEKGVNNGGEKRADSGEEENT 

KNGGEKGADSGEEKEEGINREDKTDKGGEK 

GKEADKEINKSGEKAM 


945 


2295 


A 


8074 


2 


505 


GAATLLRSASSAARKAAEAEQVWLHLHRYL 

SADRRVLGLREWGRPASERECSLCQRLKREL 

NMGDVEKGKKIFIMKCSQCHTVEKGGKHKT 

GPNLHGLFGRKTGQAPGYSYTAANKNKGIIW 

GEDTLMEYLENPKKYIPGTKMIFVGIKKKEER 

ADL1AYLKKATNE 


946 


2296 


A 


8081 


42 


590 


EGRRGKFGGKXCNFLFYFHSNSAESRMDVLF 

VAIFAVPLILGQEYEDEERLGEDEYYQWYY 

YTVTPSYDDFSADFTIDYSIFESEDRLNRLDK 

DITEAIETTISLETARADHPKPVTVKPVTTEPQ 

SPRSEAMPCPVLRSPIPLPPVRVPLFRWGCISC 

KKVGRRLLMTLWMGVWQEEIGR 


947 


2297 


A 


8084 


322 


549 


GGGSSPRELAGAAGLTVTSQAVAARRQQPSF 
SRARAPAHSLRAALSLASSARSWGAVSRDRG
PCPPAIMYQSSNKC 


948 


2298 


B 


8093 


3905 


846 


MEPGEVKDRILENISLSVKKLQSYFAACEDEI 

PAIRNHDKVLQRLCEHLDHALLYGLQDLSSG

YWVLVVHFTRREAIKQIEVLQHVATNLGRSR 

AWLYLALNENSLESYLRLFQENLGLLHKYYV 

KNALVCSHDHLTLFLTLVSGLEFIRFELDLDA 

PYLDLAPYMPDYYKPQYLLDFEDRLPSSVHG 

SDSLSLNSFNSVTSTNLEWDDSAIAPSSEDYD 

FGDVFPAVPSVPSTDWEDGDLTDTVSGPRST 

ASDLTSSKASTRSPTQRQNPFNEEPAETVSSS 

DTTPVHTTSQEKEEAQALDPPDACTELEVIRV 

TKKKKIGKKKKSRSDEEASPLHPACSQKKCA 

KQGDGDSRNGSPSLGRDSPDTMLASPQEEGE 

GPSSTTESSERSEPGLLIPEMKDTSMERLGQPL 

SKVIDQLNGQLDPSTWCSRAEPPDQSFRTGSP 

GDAPERPPLCDFSEGLSAPMDFYRFTVESPST 

VTSGGGHHDPAGLGQPLHVPSSPEAAGQEEE 

GGGGEGQTPRPLEDTTREAQELEAQLSLVRE 

GPVSEPEPGTQEVLCQLKRDQPSPCLSSAEDS 

GVDEGQGSPSEMVHSSEFRVDNNHLLLLMIH 

VFRENEEQLFKMIRMSTGHMEGNLQLLYVLL 

TDCYVYLLRKGATEKPYLVEEAVSYNELDY 

VSVGLDQQTVKLVCTNRRKQFLLDTADVAL

AEFFLASLKSAMIKGCREPPYPSILTDATMEK 

LALAKFVAQESKCEASAVTVRFYGLVHWED

PTDESLGPTPCHCSrrEO 1 1 1 JvtGMLH YKACj 1 

SYLGKEHWKTCFWLSNGILYQYPDRTDV1P 

LLSVNMGGEQCGGCRRANTTDRPHAFQVILS 

DPPCLELSAESEAEMAEWMQHLCQAVSKGVI 

PQGVAPSPCIPCCLVLTDDRLFTCHEDCQTSF 

FRSLGTAKLGDISAVSTEPGKEYCVLEFSQDS 

QQLLPPWVIYLSCTSELDRJLLSALNSGWKTIY 

QVDLPHTAIQEASNKKKFEDALSLIHSAWQR 

SDSLCRGRASRDPWC* 


949 


2299 


A 


8095 


9 


2374 


ARRADTVLLESPSMLQGLLPVSLLLSVAVSAI 

KELPGVKKYEVVYPIRLHPLHKREAKEPEQQ 

EQFETELKYKMTINGKIAVLYLKKNKNLLAP 

GYTETYYNSTGKEITTSPQIMDDCYYQGHILN 

EKVSDASISTCRGLRGYFSQGDQRYFIEPLSPI 

HRDGQEHALFKYNPDEKNYDSTCGMDGVL 

WAHDLQQNIALPATKLVKLKDRKVQEHEKY 

IEYYLVLDNGEFKRYNENQDE1RKRVFEMAN 

YVNMLYKKLNTHVALVGMEIWTDKDKIKIT 

PNASFTLENFSKWRGSVLSRRKRHDIAQLITA 

TELAGTTVGLAFMSTMCSPYSVGVVQDHSD 

NLLRVAGTMAHEMGHNFGMFHDDYSCKCPS 

TICVMDKALSFYIPTDFSSCSRLSYDKFFEDKL 

SNCLFNAPLPTDIISTPICGNQLVEMGEDCDC 

GTSEECTOICCDAKTCKIKATFQCALGECCEK 

CQFKKAGMVCRPAKDECDLPEMCNGKSGNC 

PDDRFQVNGFPCHHGKGHCLMGTCPTLQEQ 

1 ULi W VJx vj 1LV nUN.OL I l>rVJN.E,VJvJOi\. I VJ I V^,I\. 

RVDDTLIPCKANDTMCGKLFCQGGSDNLPW 

KGRIVTFLTCKTFDPEDTSQEIGMVANGTKCG 

DNKVCINAECVDIEKAYKSTNCSSKCKGHAV 

CDHELQCQCEEGWIPPDCDDSSWFHFSIVVG 

VLFPMAVIFVWAMVIRHQSSREKQKKDQRP 

LSTTGTRPHKQKRKPQMVKAVQPQEMSQMK 

PHVYDLPVEGNEPPASFHKDTNALPPTVFKD 

NPM STPKD SNPKA


950 


2300 



A 


8100 


1 


1251 


MGLLLMILASAVLGSFLTLLAQFFLLYRRQPE
PPADEAARAGEGFRYIKPVPGLLLREYLYGG 
GRDEEPSGAAPEGGATPTAAPETPAPPTRETC 
YFLNATILFLFRELRDTALTRRWVTKKIKVEF 

RLVRPWPSATGEPDGPEGEALPAACPEELAF 

EAEVEYNGGFHLAIDVDLVFGKSAYLFVKLS 

RWGRLRLVFTRVPFTHWFFSFVEDPLIDFEV 

RSQFEGRPMPQLTSIIVNQLKKIIKRKHTLPNY 

KIRFKPFFPYQTLQGFEEDEEHIHIQQWALTE 

GRLKVTLLECSRLLIFGSYDREANVHCTLELS 

SSVWEEKQRSSDCTGTISLTAVFMGWHRVSE 

AFPGLWYKLLVDLPFWGLEDGGPLLTVPLRQ 

CPG 
vrvj 


951 


2301 


A 


8108 


1612 


839 


EVALFCFEMAAGMYLEHYLDSIENLPFELQR 

NFQLMRDLDQRTEDLKAEIDKLATEYMSSAR 

SLSSEEKLALLKQIQEAYGKCKEFGDDKVQL 

A\/fPlTVPX/f\/rMfIJT1>Dl nTF»T A T> 17C A T\I vcvm 

V 1 1 r>M V Uivfi 1 KJvLU 1 ULAKr IvYDLK fcJvQ I 

ESSDYDSSSSKGKKXGRTQKEKXAARARSKG 

KNSDEEAPKTAQKKLKLVRTSPEYGMPSVTF 

GSVHPSDVLDMPVDPNEPTYCLCHQVSYGE 

MIGCDNPDCSIEWFHFACVGLTTKPRGKWFC 

PRCSQERKKK 


952 


2302 


A 


8112 


595 


291 


PSVASLARRFSGRALWPPSHSVPGNRALCPRL 
LHGTTLPGGNQRELARQKNMKKQSDSVKGK 

ANEKKEEPK 


953 


2303 


A 


8118 


1 


669 


VCAGIRDPCSTPLAKPAAGGAENLSFGfCQPG 

LETNILKMTTPNKTPPGADPKQLERTGTVREI 

GSQAVWSLSSCKPGFGVDQLRDDNLETYWQ 

SDGSQPHLVNIQFRRKTTVKTLCIYADYKSDE 

SYTPSKISVRVGNNFHNLQEIRQLELVEPSGW 

IHVPLTDNHKKPTRTFMIQIAVLANHQNGRD 

THMRQIKIYTPVEESSIGKFPRCTTIDFMMYRS 

IR 


954 


2304 


A 


8133 


66 


IU u 


PPI PPPQPPMT T7QPPT7PI DCDrt)D r'PMD C"r>T7r» a 

ARAPSPPPPFEGAPGRAMVKVTFNSALAQKE 

AKKDEPKSGEEALIIPPDAVAVDCKDPDDVV 

PVGQRRAWCWCMCFGLAFMLAGVILGGAY 

LYKYFALQPDDVYYCGIKYIKDDVILNEPSAD 

APAALYQTIEENIKIFEEEEVEFISVPVPEFADS

DPANIVHDFNKKLTAYLDLNLDKCYVIPLNT 

SIVMPPRNLLELLINIKAGTYLPQSYLIHEHMV 

ITDRIENIDHLGFFIYRLCHDKETYKLQRRETI 

KGIQKREASNCFAIRHFENKFAVETLICS


955 


2305 


A 


8143 


35 


1171 


VESRSAWHEGEDQIDRLDFIRNQMNLLTLDV 
KKKDCEVTEEVANKVSCAMTDEICRLSVLVD 
i^i tonr jir jn rU v L.IV1 1 IvoiiLJNivHItUUiVlvj KJn L 

ADRCTDEVNALVLQTQQEIIENLKPLLPAGIQ 
DKLHTLIPCKKFDLSYNLNYHKLCSDFQEDIV 
FRFSLGWSSLVHRFLGPRNAQRVLLGLSEPIF 
QLPRSLASTPTAPTTPATPDNASQEELMITLVT
GLASVTSRTSMGIIIVGGVIWKTIGWKLLSVS 
LTMYGALYLYERLSWTTHAKERAFKQQFVN
YATEKLRMIVSSTSANCSHQVKQQIATTFARL 
CQQVDITQKQLEEEIARLPKEIDQLEKIQNNS
KLLRNKAVQLENELENFTKQFLPSSNEES 


956 


2306 


A 


8157 


1854 


798 


ASGSPAPSSSSAMAAACGPGAAGYCLLLGLH
LFLLTAGPALGWNDPDRMLLRDVKALTLHY

DRYTTSRRLDP1PQLKCVGGTAGCDSYTPKVI 

QCQNKGWDG YDVQWECK 1 DLD1 AYKF OK J 

VVSCEGYESSEDQYVLRGSCGLEYNLDYTEL 

GLQKLKESGKQHGFASFSDYYYKWSSADSC 

NMSGLITIVVLLGIAFWYKLFLSDGQYSPPP 

YSEYPPFSHRYQRFTNSAGPPPPGFKSEFTGPQ

NTGHGATSGFGSAFTGQQGYENSGPGFWTGL 

GTGGILGYLFGSNRAATPFSDSWYYPSYPPSY 

PGTWNRAYSPLHGGSGSYSVCSNSDTKTRTA 

SGYGGTRRR 


957 


2307 


A 



8159 


1492 


528 


THWMTGMCYAPHQVLSYINGVTTSKPGVSL

VYSMPSRNLSLRLEGLQEKDSGPYSCSVNVQ

DKQGKSRGHS1KTLELNVLVPPAPPSCRLQGV 

PHVGANVTLSCQSPRSKPAVQYQWDRQLPSF 

QTFFAPALDVIRGSLSLTNLSSSMAGVYVCKA 

HNEVGTAQCNVTLEVSTGPGAAVVAGAVVG 

TLVGLGLLAGLVLLYHRRGKALEEPANDIKE 

DAIAPRTLPWPKSSDTISKNGTLSSVTSARAL 

RPPHGPPRPGALTPTPSLSSQALPSPRJLPTTDG 

AHPQPISPIPGGVSSSGLSRMGAVPVMVPAQS 

QAGSLV 


958 


2308 


A 


8161 


2340 


1192 


ELARRPKQQSSEKSRNMIRNWLT1FILFPLKLV 

EKCESSVSLTVPPWKLENGSSTNVSLTLRPP 

LNATLVITFEITFRSKN1TILELPDEVWPPGVT

NSSFQVTSQNVGQLTVYLHGNHSNQTGPRIR 

FLVIRSSAISUNQVIGWIYFVAWSISFYPQV1M 

NWRRKSV1GLSFDFVALNLTGFVAYSVFNIGL 

LWVPYIKEQFLLKYPNGVNPVNSNDVFFSLH 

AWLTLIIIVQCCLYERGGQRVSWPAIGFLVL 

AWLFAFVTMIVAAVGVITWLQB.FCFSYIKL 

AV1LVKYFPQAYMNFYYKSTEGWSIGNVLL 

DFTGGSFSLLQMFLQSYNNDQWTLIFGDPTK 

FGLGVFSIVFDWFFIQHFCLYRKRPGYDQLN 


959 


2309 


A 


8163 


521 


1345 


GERAGRRRGRLGVW AQPQPLLPRP V GbKKb 

MQPPGPPPAYAPTNGDFIFVSSADAEDLSGS1

ASPDVKLNLGGDFIKESTATTFLRQRGYGWL 

LEVEDDDPEDNKPLLEELDIDLKDIYYKIRCV 

LMPMPSLGFNRQVVRDNPDFWGPLAVVLFFS 

MISLYGQFRWSWHTIWIFGSLTIFLLARVLG 

GEVAYGQVLGVIGYSLLPLIVIAPVLLWGSF

EWSTLIKLFGVFWAAYSAASLLVGEEFKTK 

KPLL1YPBFLLYIYFLSLYTGV 


960 


2310 


A 


8167 


1 


2921 


MTCFKGQKGEQRSHAFEANKDHKAKVPSPN 

LYSQLNALQFTVDERSILWLNQFLLDLKQSL 

NQFMAVYKLNDNSKSDEHVDVRVDGLMLK

FVIPSEVKSECHQDQPRAISIQSSEMIATNTRH 

CPNCRHSDLEALFQDFKDCDFFSKTYTSFPKS 

CDNFNLLHPIFQRHAHEQDTKMHE1YKGNITP 

QLNKNTLKTSAATDVWAVYFSQFWIDYEGM 

KSGKGRPISFVDSFTLSIWICQPTRYAESQKEP 

QTCNQVSLNTSQSESSDLAGRLKRKKLLKEY 

VOXT7CT7PT TMnrw^VPQQ QnTFFP F^PS99F AD1 

HLLVHVHKHVSMQINHYQYLLLLFLHESL1LL 

SENLRKDVEAVTGSPASQTSICIGILLRSAELA 

LLLHPVDQANTLKSPVSESVSPWPDYLPTEN 

GDFLSSKRKQISRDINR1RSVTVNHMSDNRSM 

SVDLSHIPLKDPLLFKSASDTNLQKGISFMDY

LSDKHLGKISEDESSGLVYKSGSGEIGSETSD 

KXDSFYTDSSSVLNYREDSNILSFDSDGNQNI 

LSSTLTSKGNETIESIFKAEDLLPEAASLSENL

DISKEETPPVRTLKSQSSLSGKPKERCPPNLAP

LCVSYKNMKRSSSQMSLDTISLDSMILEEQLL 

ESDGSDSHK^EKGNKKNSTTNYRGTAESVN 

AGANLQNYGETSPDAISTNSEGAQENHDDLM 

SVVVFKITGVNGEIDIRGEDTEICLQVNQVTP

DQLGNISLRHYLCNRPVGSDQKAVIHSKSSPE 

ISLRFESGPGAVIHSLLAEKNGFLQCHIENFST 

EFLTSSLMNIQHFLEDETVATVMPMKIQVSNT 

K1NLKDDSPRSSTVSLEPAPVTVHIDHLWER

SDDGSFHIRDSHMLNTGNDLKENVKSDSVLL 

TSGKYDLKKQRSVTQATQTSPGVPWPSQSAN 

FPEFSFDFTREQLMEENESLKQELAKAKMAL 

AEAHLEKDALLHHIKKMTVE 


961 


2311 


A 


8172 


1442 


682 


TAAMSIFTPTNQIRLTNVAVVUMKRAGKRFEI 

ACYKNKWGWRSGVEKDLDEVLQTHSVFVN 

VSKGQVAKKEDLISAFGTDDQTEICKQILTKG 

EVQVSDKERHTQLEQMFRDIATIVADKCVNP 

ETKRPYTVILIERAMKDIHYSVKTNKSTKQQA 

LEVIKQLKEKMKIERAHMRLRFILPVNEGKKL 

KEKLKPLIKVIESEDYGQQLEIVCLIDPGCFREI 

DELIKKETKGKGSLEVLNLKDVEEGDEKFE


962 


2312 


A 


8175 


286 


587 


NISNKAEVSSHPSV1SHSMDSFGQPRPEDNQS 
VLRRMQKKYWKTKQVnKATGKKEDEHLVA 
SDAELDAKLEVFHSVQETCTELLKIIEKYQLR 
LNGMKS 


963 


2313 


A 


8181 


13 


2215 


AEGCAERRGTEPVVELSVLSWESGAGPGLGSQ 

GMDLVWSAWYGKCVKGKGSLPLSAHGIW 

AWLSRAEWDQVTVYLFCDDHKLQRYALNRI 

TVWRSRSGNELPLAVASTADLIRCKLLDVTG 

GLGTDELRLLYGMALVRFVNLISERKTKFAK 

VPLKCLAQEVNIPD WIVDLRHELTHKKMPHI 

NDCRRGCYFVLDWLQKTYWCRQLENSLRET 

WELEEFREGIEEEDQEEDKN1WDDITEQKPE 

PQDDGKSTESDVKADGDSKGSEEVDSHCKK 

ALSHKELYERARELLVSYEEEQFTVLEKFRYL 

PKAIKAWNNPS PR VEC VLAELKG VTCEN REA 

VLDAFLDDGFLVPTFEQLAALQIEYEENVDL 

NDVLVPKPFSQFWQPLLRGLHSQNFTQALLE 

RMLSELPALGISGIRPTYILRWTVELIVANTKT 

GRNARRFSAGQWEARRGWRLFNCSASLDWP 

RMVESCLGSPCWASPQLLRIIFKAMGQGLPD 

EEQEKLLRICSIYTQSGENSLVQEGSEASPIGK 

SPYTLDSLYWSVKPASSSFGSEAKAQQQEEQ 

GSVNDVKEEEKEEKEVLPDQVEEEEENDDQE 

EEEEDEDDEDDEEEDRMEVGPFSTGQESPTA 

ENARLLAQKRGALQGSAWQVSSEDVRWDTF 

PLGRMPGQTEDPAELMLENYDTMYLLDQPV 

LEQRLEPSTCKTDTLGLSCGVGSGNCSNSSSS 

NFEGLLWSQGQLHGLKTGLQLF 


964 


2314 


A 


8184 


6 


1393 


EPRRNFRDDSTRPRTRGRTRGRRRRACRSAE 

GTGLRSLLLPPRLQLPAGPFSRCRWDPVSSFR 

PSTMPPKKGGDGIKPPPIIGRFGTSLKIGIVGLP 

NVGKSTFFNVLTNSQASAENFPFCTIDPNESR 

VPVPDERFDFLCQYHKPASKIPAFLNVVDIAG 

LVKGAHNGQGLGNAFLSHISACDG1FHLTRA 

FEDDDITHVEGSVDPIRDIEirHEELQLKDEEMI 

GPHDKLEKVAVRGGDKKLKPEYDIMCKVKS 

WVIDQKXPVRFYHDWNDKEIEVLNKHLFLTS 

KPMVYLVN^SEKDYIRKKNKWLIKIKEWVD 

KYDPGALVIPFSGALELKLQELSAEERQKYLE 

ANMTQSALPKHKAGFAALQLEYFFTAGPDEV 
RAWTIRKGTKAPQAAGKIHTDFEKGFIMAEV 
MKYEDFKEEGSENAVKAAGKYRQQGRNYIV 
EDGD1 IFFKFNTPQQPKKK 


965 


2315 


A 


8195 


1437 


594 


RSFSLSFSLLSPSEMMALGAAGATRVFVAMV 

AAALGGHPLLGVSATLNSVLNSNAIKNLPPPL 

GGAAGHPGSAVSAAPGILYPGGNKYQTIDNY 

QPYPCAEDEECGTDEYCASPTRGGDAGVQIC 

LACRKRRKRCMUHAMCCPGNYCKNGICVSS 

DQNHFRGEIEETITESFGNDHSTLDG YSRRTT 

LSSKMYHTKGQEGSVCLRSSDCASGLCCARH 

FWSKICKPVLKEGQVCTKHRRK.GSHGLEIFQ 

RCYCGEGLSCRIQKDHHQASNSSRLHTCQRH 


966 


2316 


A 


8207 


416 


4082 


KFKLIKIMLLTLnLLPVVSKFSFVSLSAPQHW 

SCPEGTLAGNGNSTCVGPAPFUFSHGNSIFRI 

DTEGTNYEQLVVDAGVSVIMDFHYNEKRIY 

WVDLERQLLQRVFLNGSRQERVCNIEKNVSG 

MAINWINEEVIWSNQQEGIITVTDMKGNNSHI 

LLSALKYPANVAVDPVERFIFWSSEVAGSLY 

RADLDGVGVKALLETSEKITAVSLDVLDKRL 

FWIQYNREGSNSLICSCDYDGGSVHISKHPTQ 

HNLFAMSLFGDRTF Y ST WKMKTI WIANKHTG 

KDMVRINLHSSFVPLGELKVVHPLAQPKAED 

DTWEPEQKLCKLRKGNCSSTVCGQDLQSHLC 

MCAEGYALSRDRKYCEGNDWKYCEDVNEC 

AFWNHGCTLGCKNTPGSYYCTCPVGFVLLPD 

GKRCHQLVSCPRNVSECSHDCVLTSEGPLCF 

CPEGSVLERDGKTCSGCSSPDNGGCSQLCVPL 

SPVSWECDCFPGYDLQLDEKSCAASGPQPFL 

LFANSQDIRHMHFDGTDYGTLLSQQMGMVY 

ALDHDPVENKIYFAHTALKWIERANMDGSQ 

RERLIEEGVD VPEGL A VD WIGRRF Y WTDR GK 

SLIGRSDLNGICRSKJITIENISQPRGIAVHPMAK 

RLFWTDTGINPRXESSSLQGLGRLVIASSDLIW 

PSGITIDFLTDKLYWCDAKQSViElVIANLDGSK 

RRRLTQNDVGHPFAVAVFEDYVWFSDWAMP 

SVIRVNKRTGKDRVRLQGSMLKPSSLWVHP 

LAKPGADPCLYQNGGCEHICKKRLGTAWCS 

CREGFMKASDGKTCLALDGHQLLAGGEVDL 

KNQVTPLDILSKTRVSEDNITESQHMLVAEIM 

VSDQDDCAPVGCSMYARCISEGEDATCQCLK 

GFAGDGKLCSDIDECEMGVPVCPPASSKCINT 

EGGYVCRCSEGYQGDGIHCLDIDECQLGVHS 

CGENASCTOTEGGYTCMCAGRLSEPGLICPD 

STPPPHLREDDHHYSVRNSDSECPLSHDGYCL 

HDGVCMYIEALDKYACNCVVGYIGERCQYR 

DLKWWELRHAGHGQQQKVIWAVCWVLV 

MLLLLSLWGAHYYRTQKLLSKNPKNPYEESS 

RD VRSRRPADTEDGMSSCPQPWFVVIKEHQD 

LKNGGQP V AG EDG Q A ADG SMQPTS WRQEPQ 

LCGMGTEQGCW IPVS SDKGSCPQVMERSFH 

MPSYGTQTLEGGVEKPHSLLSANPLWQQRAL 

DPPHQMELTQ 


967 


2317 


A 


8210 


3 


601 


SSAMGSRSSHAAVIPDGDSIRRETGFSQASLL 

RLHHRFRALDRNKXGYLSRMDLQQIGALAV 

NPLGDRJIESFFPDGSQRVDFPGFVRVLAHFRP 

VEDEDTETQDPKKPEPLNSRRNKLHYAFQLY 

DLDRDGKISRHEMLQVLRLMVGVQVTEEQL 

ENLADRTVQEADEDGDGAVSFVEFTKSLEKM 

DVEHKMSIRJLK 


SEQ ID NO: of nucleotide sequence 

SEQ ID NO: of peptide sequence 

Method 

SEQ ID NO: in USSN 09/496914 

Predicted beginning nucleotide location corresponding to first amino acid residue of peptide sequence 

Predicted end nucleotide location corresponding to last amino acid residue of peptide sequence 

Amino acid sequence (A=Alanine, C=Cysteine, 
D=Aspartic Acid, E=Glutamic Acid, 
F=Phenylalanine, G=Glycine, H=Histidine, 
I=Isoleucine, K=Lysine, L=Leucine, 
M=Methionine, N=Asparagine, P=Proline, 
Q=Glutamine, R=Arginine, S=Serine, 
T=Threonine, V=Valine, W=Tryptophan, 
Y=Tyrosine, X=Unknown, *=Stop codon, 
/=possible nucleotide deletion, \=possible 
nucleotide insertion) 


968 


2318 


A 


8211 


2 


409 


ISSCPHTAYEGSMSTLSNFTQTLEDVFRRJFIT 
YMDN WRQNTTAEQEALQAK VDAENFYY V1L 
YLMV.MIGMFSFUVAILVSTVKSKRREHSNDP 
YHQYIVEDWQEKYKSQILNLEESKATIHENIG 
AAGFKMSP 


969 


2319 


A 


8215 


1 


1938 


GMPRSRGGRAAPGPPPPPPPPGQAPRWSRWR 

VPGRLLLLLLPALCCLPGAARAAAAAAGAGN 

RAAVAVAVARADEAEAPFAGQNWLKSYGY 

LLPYDSRASALHSAKALQSAVSTMQQFYGIP 

VTGVLDQTTIEWMKKPRCGVPDHPHLSRRRR 

NKRYALTGQKWRQKHITYSIHNYTPKVGELD 

TRKAIRQAFDVWQKVTPLTFEEVPYHEIKSDR 

KEADIMIFFASGFHGDSSPFDGEGGFLAHAYF 

PGPGIGGDTHFDSDEPWTLGNANHDGNDLFL 

VAVHELGHALGLEHSSDPSAIMAPFYQYMET 

HKFKLPQDDLQG1QKIYGPPAEPLEPTRPLPTL 

r VRKlHbrsbKlUlbKyrKxPRrrLODKr I r(j I 

KPNICD GNFNTVALFRGEMF VFKDRWF WRL 

RNNRVQEGYPMQIEQFWKGLPARIDAAYER 

ADGRFVFFKGDK YW VFKEVTVEPG YPH SLG 

ELGSCLPREGIDTALRWEPVGKTYFFKGERY 

WRYSEERRATDPGYPKPITVWKGIPQAPQGA 

F1SKEGYYTYFYKGRDYWKFDNQKLSVEPGY 

PRNILRDWMGCNQKEVERRKERRLPQDDVDI 

MVTINDVPGSVNAVAWIPCILSLCILVLVYTI 

FQFKNKTGPQPVTYYKRPVQEWV 


970 


2320 


A 


8216 


1235 


2223 


SRLSLQFYVSFRRTGLFTCKLIVEIFFRNYMN 

nci 'DTxn/n/o cadcti a \ nvi a ad at atdt r> 
L>oLK i M Vr V KrV^Jnb I lAL-AL-l I LAAKALVirLr 

TRPH WFLLFGTTEEEIQEICIETLRL YTRKKPN 

YELLEKEVEKRKV ALQEAKLKAKG LNPDGTP 

ALSTLGGFSPASKPSSPREVBCAEEKSPISINVK 

TVKKEPEDRQQASKSPYNGVRKDSKRSRNSR 

SASRSRSRTRSRSRSHTPRRHYNNRRSRSGTY 

SSRSRSRSRSHSESPRRHHNHGSPHLKAKHTR 

DDLKSSNRHGHKRKKSRSRSQSKSRDHSDAA 

KKHRHERGHHRDRRERSRSFERSHKSKHHGG 

SRSGHGRHRR 


971 


2321 


A 


8217 


3 


3274 


DCRLQAAMPTNFTVVPVEAHADGGGDETAE 

RTEAPGTPEGPEPERPSPGDGNPRENSPFLNN 

VEVEQESFFEGKNMALFEEEMDSNPMVSSLL 

NKLANYTNLSQGWEHEEDEESRRREAKAPR 

MGTFIGVYLPCLQNILGVILFLRLTWIVGVAG 

VLESFLIVAMCCTCTMLTAISMSAIATNGVVP 

AGGSYYMISRSLGPEFGGAVGLCFYLGTTFA 

GAMYILGTIEIFLTYISPGAAIFOAEAAGGEAA 

AMLHNMRVYGTCTLVLMALVVFVGVKYVN 

KLALVFLACWLSILAIYAGV1KSAFDPPDIPV. 

CLLGNRTLSRRSFDACVKAYGIHNNSATSAL " 

WGLFCNGSQPSAACDEYFIQNNVTEIQGIPGA 

ASGVFLENLWSTYAHAGAFVEKKGVPSVPV 

API7QTJ A <sT7 PWT THIA AQT7TI I VflTVT'PQVTY^ 

IMAGSNRSGDLKDAQKSiPTGTILArVTTSFlY 

LSCIVLFGACIEGWLRDKFGEALQGNLVIGM 

LAWPSPWVIVIGSFFSTCGAGLQTLTGAPRLL 

QAIARDGIVPFLQVFGHGKANGEPTWALLLT 

VLICETG ILIASLDSVAPILSMFFLMCYLFVNL 

AC^VQTLLRTPNWRPRFKFYHWTLSFLGMSL 

CLALMFICSWYYALSAMLIAGCIYKYIEYRG 

AEKEWGDG1RGLSLNAARYALLRVEHGPPHT 

KNWRPQVLVMLNLDAEQAMKHPRLLSFTSQ 

LBCAGKGLTTVGSVLEGTYLDKHMEAQRAEE 

NIRSLMSTEKTKGFCQLWSSSLRDGMSHLIQ 

SAGLGGLKHNTVLMAWPASWKQEDNPFSW 

KNFVDTVRDTTAAHQALLVAKNVDSFPQNQ 

ERFGGGHIDVWWIVHDGGMLMLLPFLLRQH 

K V WRKCRMRJFTV AQ VDDN SIQMKKDLQMF 

LYHLRJSAEVEWEMVENDISAFTYERTLMM 

EQRSQMLKQMQLSKNEQEREAQLIHDRNTAS 

HTAAAARTQAPPTPDKVQMTWTREKLIAEK 

YRSRDTSLSGFKDLFSMKPDQSNVRRMHTAV 

KLNGWLNKSQDAQLVLLNMPGPPKNRQGD 

ENYMEFLEVLTEGLNRVLLVRGGGREVITIYS 


972 


2322 


A 


8224 


701 


246 


TSRRVTMKFNPFVTSDRSKNRKRHFNAPSHV 
RRKIMSSPLSKELRQKYNVRSMPIRKDDEVQ 
VVRGHYKGQQIGKVVQVYRKKYVIYIERVQ 
REKANGTTVHVGIHPSKVVTTRLKLDKDRKKI 
LERKAKSRQVGKEK GK YKEELIEKMQE 


973 


2323 


A 



8237 


873 


4610 


GCPHAGGKGRVPTGGLTGGRTWSPSAAPRSC 

PRPGPTPAPGAMDKLPPSMRKRLYSLPQQVG 

AKAWIMDEEEDAEEEGAGGRQDPSRRSIRLR 

PLPSPSPSAAAGGTESRSSALGAADSEGPARG 

AGKSSTNGDCRRFRGSLASLGSRGGGSGGTG 

SGSSHGHLHDSAEERRLIAEGDASPGEDRTPP 

GLAAEPERPGASAQPAASPPPPQQPPQPASAS 

CEQPSVDTAIKVEGGAAAGDQILPEAEVRLG 

QAGFMQRQFGAMLQPGVNKFSLRMFGSQKA 

VEREQERVKSAGFWIIHPYSDFRFYWDLTML 

LLMVGNLIIIPVGITFFKDENTTPWIVFNWSD 

TFFLIDLVLNFRTGIWEDNTEIILDPQR1KMK 

YLKSWFMVDFISSJPVDY1FLIVFTRTDSEVYK 

TARALRJVRFTKILSLLRLLRLSRLIRYIHQWE 

EIFHMTYDLASAWRIVNLIGMMLLLCHWDG 

CLQFLVPMLQDFPDDCWVSINNMVNNSWGK 

QYSYALFKAMS11MLCIGYGRQAPVGMSDV 

WLTMLSMI VGATCY AMFIGHATALIQSLDSS 

RRQYQEKYKQVEQ YMSFHKLPPDTRQRJH D 

YYEHRYQGKMFDEESILGELSEPLREEHNFNC 

RKLVASMPLFANADPNFVTSML'IIOJIFEVFQ 

PGDYIIREGTIGKJCMYFIQHGVVSVLTKGNKE 

TKLADGSYFGEICLLTRGRRTASVRADTYCR 

LYSLSVDNFNEVLEEYPMMRRAFETVALDRL 

DRIGKKNSILLHKVQHDLNSGVFNYQENEIIQ 

QIVQHDREMAHCAHRVQAAASATPTPTPVIW 

TPL1 Q APLQ AAAATT S V AI ALTHHPRLP AAI FR 

PPPGSGLGNLGAGQTPRKLKRLQSLIPSALGS 

ASPASSPSQVDTPSSSSFHIQQLAGFSAPAGLS 

PLLPSSSSSPPPGACGSPSAPTPSAGVAATT1A 

GFGHFHKALGGSLSSSDSPLLTPLQPGARSPQ 

AAOP^PAPPGARGGLGLPEHFLPPPPSSRSPSS 

SPGQLGQPPGELSLGLATGPLSTPETPPRQPEP 

PSLVAGASGGASPVGFTPRGGLSPPGHSPGPP 

RTFPSAPPRASGSHGSLLLPPASSPPPPQVPQR 

RGTPPLTPGRLTQDLKLISASQPALPQDGAQT 

LRRASPHSSGESMAAFPLFPRAGGGSGGSGSS 

GGLGPPGRPYGAIPGQHVTLPRKTSSGSLPPP 

LSLFGARATSSGGPPLTAGPQREPGARPEPVR 

SKLPSNL 


974 


2324 


A 


8247 


279 


468 


EYKQWERRFLSCQNRNDLGYGKPRKGGGLL 
LVPVKDASRICSLTYLLGSHWNNLVVRSPVL 
G 


975 


2325 


A 


8249 


62 


1571 


LVALKNWKPKGTNIPAPQSPVFGEAVSGVYM 

MTKVLGMAPVLGPRPPQEQVGPLMVKVEEK 

EEKGKYLPSLEMFRQRFRQFGYHDTPGPREA 

LSQLRVLCCEWLRPE1HTKEQILELLVLEQFLT 

ILPQELQAWVQEHCPESAEEAVTLLEDLEREL 

DEPGHQVSTPPNEQKPVWEKJSSSGTAKESPS 

SMQPQPLETSHKYESWGPLY1QESGEEQEFAQ 

DPRICVRDCRT STOHFESADEOKGSFAEfil KG 

DIISV11ANKPEASLERQCVNLENEKGTKPPLQ 

EAGSKKGRESVPTKPTPGERRYICAECGKAFS 

NSSNLTKHRRTHTGEKPYVCTKCGKAFSHSS 

NLTLHYRTHLVDRPYDCKCGKAFGQSSDLLK 

HQRMHTEEAPYQCKDCGKAFSGKGSLIRHYR 

IHTGEKPYQCNECGKSFSQHAGLSSHQRLHT 

GEKPYKCKECGKAFNHSSNFNKHHR1HTGEK 

PYWCHHCGKTFCSKSNLSKHQRVHTGEGEA 

P 


976 


2326 


A 


8257 


298 


7086 


GNMACWPQLRLLLWKNLTFRRRQTCQLLLE 

VA WPLFIFL1LIS VRL S YPPYEQHECHFPNKAM 

PSAGTLPWVQGIICNANNPCFRYPTPGEAPGV 

VGNFNKSIVARLFSDARRLLLYSQKDTSMKD 

MRK VLRTLQQIKKSS SNLKLQDFLVDNETFS 

GFLYHNLSLPKSTVDKMLRADVILHKVFLQG 

YQLHLTSLCNGSKSEEM1QLGDQEVSELCGLP 

REKLAAAERVLRSNMD1LKPILRTLNSTSPFPS 

KELAEATKTLLH SLGTLAQELFSMRS WSDMR 

QEVMFLTNVNSSSSSTQ1YQAVSRIVCGHPEG 

GGLKIKSLNWYEDNNYKALFGGNGTEEDAE 

TFYDNSTTPYCNDLMKNLESSPLSRI1WKALK 

PLLVGKILYTPDTPATRQVMAEVNKTFQELA 

WHDLEGMWEELSPK1WTFMENSQEMDLVR 

MLLDSRDNDHFWEQQLDGLDWTAQDIVAFL 

AKHPEDVQSSNGSVYTWREAFNETNQAIRTIS 

RFMECVmNKLEPIATEVWLINKSMELLDER 

KFWAGIVFTGITPGSIELPHHVKYKIRMGIDN 

VERTNKIKPGYWDPGPRADPFEDMRYVWGG 

FAYLQDWEQAIIRVLTGTEKKTGVYMQQMP 

YPCYVDDIFLRVMSRSMPLFMTLAW1YSVAV 

IIKGIVYFJCEARLKETMRJMGLDNSILWFSWFI 

SSLIPLLVSAGLLW1LKLGNLLPYSDPSWFV 

FLSVF A WTILQCFLI STLFSRANLAAACGGII 

YFTLYLPYVLCVAWQDYVGFTLKIFASLLSP 

VAFGFGCEYFALFEEQGIGVQWDNLFESPVE 

EDGFNLTTSVSMMLFDTFLYGVMTWYIEAVF 

PGQYGIPRPWYFPCTKSYWFGEESDEKSHPGS 

NQKRISEICMEEEPTHLKLGVSIQNLVKVYRD 

GMKVAVDGLALNFYEGQITSFLGHNGAGKT 

TTMSILTGLFPPTSGTAYILGKDIRSEMSTTRQ 

NLGVCPQHNVLFDMLTVEEHIWFYARLKGLS 

EKilVKAEMEQMALDVGLPSSKLKSKTSQLS 

GGMORKLSVALAFVGGSKVVII DEPTAGVTVP 

YSRRGIWELLLKYRQGRTIILSTHHlVCDEADVL 

GDRIAIISHGKLCCVG SSLFLKNQLGTGYYLT 

LVKXDVESSLSSCRNSSSTVSYLKKEDSVSQS 

SSDAGLG SOHESDTLTIDVS A1SNLIRKHVSEA 

RLVEDIGHELTYVLPYEAAKEGAFVELFHEID 

DRLSDLGISSYGISETTLEEIFLKVAEESGVDA 

ETSDGTLPARRNRRAFGDKQSCLRPFTEDDA 

ADPNDSDIDPESRKTDLLSGMDGKGSYQVKG 

WKLTQQQFVALLWKRLLIARRSRKGFFAQIV 

LPAVFVCIALVFSLIVPPFGKYPSLELQPWMY 

NEQYTFVSNDAPEDTGTLELLNALTKX)PGFG 

TRCMEGNP1PDTPCQAGEEEWTTAPVPQT1M 

DLFQNGNWTMQNPSPACQCSSDKIKKMLPV 

CPPGAGGLPPPQRKQNTADILQDLTGRN1SDY 

LVKTYVQ1IAKSLKNKIWVNEFRYGGFSLGVS 

NTQALPPSQEVNDATKQMKKHLKLAJCDSSA 

DRFLNSLGRFMTGLD'mNNVKVWFNNKGW 

HAISSFLNVINNA1LRANLQKGENPSHYGITAF 

NHPLNLTKQQLSEVAPMTTSVDVLVSICVIFA 

MSFVPASFWFLIQERVSKAKHLQFISGVKPVI 

YWLSNFVWDMCNYVVPATLV111F1CFQQKSY 

VSSTNLPVLALLLLLYGWSITPLMYPASFVFK 

IPSTAYVVLTSVNLFIGINGSVATFVLELFTDN 

K1JWINDILKSVFUFPHFCLGRGLIDMVKNQ 

AMADALERFGENRFVSPLSWDLVGRNLFAM 

AVEGVVFFLITVLIQYRFFIRPRPVNAKLSPLN 

DEDEDVRRERQRILDGGGQNDILEIKELTKIY 

RRKJRKPAVDRICVGrPPGECFGLLGVNGAGK 

SSTFKMLTGDTTVTRGDAFLNRNSILSNIHEV 

HrfNFMnYrPOFDATTFLI TGREHVEFFALLRG 

VPEKEVGKVGEWAIRKLGLVKYGEKYAGNY 

SGGNKRKLSTAMALIGGPPWFLDEPTTGMD 

PKARRFLWNCALSWKEGRSWLTSHSMEEC 

EALCTRMAIMVNGRFRCLGSVQHLKNRFGD 

G YTI VVRJ AG SNPDLKP VQDFFGL AFPGS VPK 

EKHRNMLQYQLPSSLSSLARIFSILSQSKKRLH 

IEDYSVSQTTLDQVFVNFAKDQSDDDHLKDL 

SLHKNQTWDVAVLTSFLQDEKVKESYV 


977 


2327 


A 


8260 


3 


1567 


IPGSTISFSLCnFPPCVPTMVRKPWSTISKGG 

YLQGNVNGRLPSLGNKEPPGQEKVQLKRKV 

TLLRGVSIIIGTIIGAGIFISPKGVLQNTGSVGM 

SLTIWTVCGVLSLFGALSYAELGTTIKKSGGH 

YTYILEVFGPLPAFVRVWVELLIIRPAATAVIS 

LAFGRYILEPFFIQCEIPELAIKLITAVGITVVM 

VLNSMSVSWSAR1QIFLTFCKLTAILIIIVPGV 

MOT TKGOTONFKDAFSGRDSSITRLPLAFYYG 

MYAYAGWFYLNFVTEEVENPEKTIPLAICISM 

AIVTIGYVLTNVAYFTTINAEELLLSNAVAVT 

FSERLLGNFSLAVPIFVALSCFGSMNGGVFAV 

SRLFYVASREGHLPEILSM1HVRKHTPLPAVTV 

LHPLTM1MLFSGDLDSLLNFLSFARWLF1GLA 

VAGLIYLRYKCPDMHRPFKVPLFIPALFSFTC 

LFMVALSLYSDPFSTGIGFVITLTGVPAYYLFn 

WDKKPRWFRIMSEKITRTLQILLEWPEEDKL 


978 


2328 


A 


8261 


2 


2165 


RGGSLRCVLGKLLGQLLCFQSERCVRFPEGLL 

RHRGCGLLSSRLSAGKPPLRTSFFGSWGVLPP 

LADAASMSGVRAVRISIESACEKQVHEVGLD 

GTETYLPPLSMSQNLARLAQRIDFSQGSGSEE 

EEAAGTEGDAQEWPGAGSSADQDDEEGVVK 

FQPSLWPWDSVRNNLRSALTEMCVLYDVLSI 

VRDKKFMTLDPVSQDALPPKQNPQTLQLISK 

KKSLAGAAQ1LLKGAERLTKSVTENQENKLQ 

RDFNSELLRLRQHWKLRKVGDKJLGDLSYRS 

AGSLFPHHGTFEVDCNTDLDLDKK1PEDYCPL 

D VQIPSDLEG SAYIKVSIQKQAPDIGDLGTVN 

LFKRPLPKSKPGSPH WQTKLEAAQN VLLCKEI 

FAQLSREAVQIKSQVPHIWKNQHSQPFPSLQ 

LSISLCHSSNDKJCSQKFATEKQCPEDHLYVLE 

HNLHLLIREFHKQTLSSIMMPHPASAPFGHKR 

MRLSGPQAFDKNEINSLQSSEGLLEKIIKQAK 

HIFLRSRAAATIDSLASRIEDPQIQAHWSNIND 

VYESSVKVLITSQGYEQICKSIQLQLNIGVEQI 

RWHRDGRV ITLSY QEQELQDFLLSQMSQHQ 

VHAVQQLAKVMGWQVLSFSNHVGLGPIESIG 

NASAITVASPSGDYAISVRNGPESGSKLMVQF 

PRNQCKDLPKSDVLQDNKWSHLRGPFKEVQ 

WNKMEGRNFVYKMELLMSALSPCLL 


979 


2329 


A 


8289 


2 


1053 


FVWNPRGGRKRRRQAAVTQAATRASGTPSP 
RDGTMTQGKLSVANKAPGTEGQQQVHGEKK 

P A P A VP Q A PPQ V PP A T<lflVfl\AK A H A PPP A PT A 

VPLHPSWAYVDPSSSSSYDNGFPTGDHELFTT 

FSWDDQKVRRVFVRKVYTILnQLLVTLAVV 

ALFTFCDPVKDYVQANPGWYWASYAVFFAT 

YLTLACCSGPRRHFPWNLILLTVFTLSMAYLT 

GMLSSYYNTTSVLLCLGITALVCLSVTVFSFQ 

TKFDFTSCQGVLFVLLMTLFFSGLILAILLPFQ 

YVPWLHAVYAALGAGVFTLFLALDTQLLMG 

NRRHSLSPEEYIFGALNTYLDirYIFTFFLQLFG 

TMT)T? 

1 l\K_b 


980 


2330 


A 


8305 


59 


857 


ASQLPDYSISPPSLPPRISFHPSPTLARVAMAEP 

SEATQSHSISSSSFGAEPSAPGGGGSPGACPAL 

GTKSCSSSCAVHDLIFWRDVKKTGFVFGTTLI 

MLLSLAAFSVISWSYLILALLSVTISFRIYKSV 

IQAVQKSEEGHPFKAYLDVDITLSSEAFHNY 

MNAAMVHINRALKLIIRLFLVEDLVDSLKLA 

VFMWLMTYVGAVFNG1TLL1LAELL1FSVPIV 

YEKYKTQIDHYVGIARDQTKSIVEKIQAKLPG 

IAKKKAE 


981 


2331 


A 


8308 


186 


1337 


TRMSRHEGVSCDACLKGNFRGRRYKCLICYD 

YDLCASCYESGATTTRHTTDHPMQCILTRVD 

FDLYYGGEAFSVEQPQSFTCPYCGKMGYTET 

r^PTJVT^PT-TAPTQTPVTr'Pir'A AT P^nriPMU 

VTDDFAAHLTLEHRAPRDLDESSGVRHVRR 

MFHPGRGLGGPRARRSNMHFTSSSTGGLSSS 

QSSYSPSNREAMDPIAELLSQLSGVRRSAGGQ 

LNSSGPSASQLQQLQMQLQLERQHAQAARQ 

QLETARNATRRTNTSSVTTTITQSTATTNIAN 

TESSQQTLQNSQFLLTRLNDPKMSETERQSM 

ESERADRSLFVQELLLSTLVREESSSSDEDDR 

GEMADFGAMGCVDIMPLDVALENLNLKESN 

KGNEPPPPPL 


982 


2332 


A 


8315 


1 


1004 


GSTHASADAWAQWFCTEALVMGAPVWYLV 

A A AT T VHPTT PI TP<3PfiP. A A^AHOPPI T-TlsJFPI 

AGAGRVAQPGPLEPEEPRAGGRPRRRRDLGS 
RLQAQRRAQRVAWAEADENEEEAVILAQEE 
EGVEKPAETHLSGKIGAKKLRKLEEKQARKA 
QREAEEAEREERKRLESQRPAEWKKEEERLR 
LEEEQKEEEERKAREEQAQREHEEYLKLKEA 
FWEEE GV GETMTEEOSOSFLTEFIN Y1KOSK 
VVLLEDLASQVGLRTQDTINRJQDLLAEGTIT 
GVDDDRGKFIYITPEELAAVANFIRQRGRVSIA 
ELAQASNSLIAWGRESPAQAPA 


983 


2333 


A 


8320 


244 


1420 


RRRWRARGGLVPTLAWAEATGAYVPGRDKP 

DLPTWKRNFRSALNRKEGLRiAEDRSKDPHD 

PHKIYEFVNSGVGDFSQPDTSPDTNGGGSTSD 

TQEDILDELLGNMVLAPLPDPGPPSLAVAPEP 

CPQPLRSPSLDNPTPFPNLGPSENPLKRLLVPG 

EEWEFEVTAFYRGRQVFQQTI SCPEGLRL VGS 



2334 



8321 






2335 



2336 



8322 



8325 



1243 



352 



89 



529 






1172 



987 



988 



2337 



8326 



2338 



8335 



1205 



989 



990 



2339 A 



2340 A 



8349 



8361 



67 



210 



EVGDRTLPGWPVTLPDPGMSLTDRGVMSYV 

RHVLSCLGGGLALWRAGQWLWAQRLGHCH 

TYWAVSEELLPNSGHGPDGEVPKDKEGGVF 

DLGPFIVGSLGPPDLITFTEGSGRSPRYALWFC 

VG ES WPQDQPWTKRL VMVKVVPTCLRAL VE 

MARVGGASSLENTVDLHISNSHPLSLTSDQY 

KAYLQDLVEGMDFQGPGES 

ANMAPVEHWADAGAFLRHAALQDIGKN1Y 

TIREVVTEIRDKATRRRLAVLPYELRFKEPLPE 

YVRLVTEFSKKTGDYPSLSATDIQVLALTYQL 

EAEFVGVSHLKQEPQKVKVSSSIQHPETPLHIS 

GFHLPYKPKPPQETEKGHSACEPENLEFSSFM 

FWRNPLPNIDHELQELLIDRGEDVPSEEEEEEE 

NGFEDRKDDSDDDGGGWITPSNIKQIQQELE 

QCDVPEDVRVGCLTTDFAMQNVLLQMGLHV 

LAVNGMLIREARSYILRCHGCFKTTSDMSRV 

FCSHCGNKTLKKVSVTVSDDGTLHMHFSRNP 

KVLNPRGLRYSLPTPKGGKYAINPHLTEDQRF 

PQLRLSQKARQKTNVFAPDYIAGVSPFVENDI 

SSRSATLQVRDSTLGAGRRRLNPNASRKKFV 

KKR 

RRNNIRQFIMKVCISG QARWLTPVVPVL WET 
EAGRSLELKSLRPAWATWGNPISTKINK 



470 



323 



185 



1115 



KMNPTDIADTTLDESIY SNYYLYESIPKPCTKE 

GIKAFGELFLPPLYSLVFVFGLLGNSWVLVL 

FKYKRLRSMTDVYLLNLAISDLLFVFSLPFWG 

YYAADQWVFGLGLCKMISWMYLVGFYSGIF 

FVMLMSIDRYLAJVHAVFSLRARTLTYGVITS 

LATWSVAVFASLPGFLFSTCYTERNHTYCKT 

KYSLNSTTWKVLSSLEINILGLVIPLGIMLFCY 

SMHRTLQHCKNEKKNKAVKMIFAVVVLFLG 

FWTPYNIVLFLETLVELEVLQDCTFERYLDYA 

IQATETLAFVHCCLNPIIYFFLGEKFRKYILQL 

FKTCRGLFVLCQYCGLLQIYSADTPSSSYTQS 

TMDHDLHDAL 

SLSAMRFLAATFLLLALSTAAQAEPVQFKDC 

GSVDGVIKEVNVSPCPTQPCQLSKGQSYSVN 

VTFTSNIQSKSSKAWHGILMGVPVPFPIPEPD 

GCKSGINCPIQKDKTYSYLNKLPVKSEYPSIK 

LWEWQLQDDKNQSLFCWEIPVQIVSHL 

VIKMALAARLLPQFLHSRSLPCGAVRLRTPA 

VAEVRLPSATLCYFCRCRLGLGAALFPRSAR 

ALAASALPAQGSRWPVLSSPGLPAAFASFPAC 

PQRSYSTEEKPQQHQKTKNIIVLGFSNPINWV 

RTRIKAFL IWAYFDKEFSITEFSEGAKQAFAH 

VSKLLSQCKFDLLEELVAKEVLHALKEKVTS 

LPDNHKNALAANIDEIVFTSTGDISIYYDEKG 

RKFVNILMCFWYLTSANIPSETLRGASVFQVK 

LGNQNVETKQLLSASYEFQREFTQGVKPDWT 

IARIEHSKLLE 

MSGFIHQLLIQNLFCVYHTRLKTSQGLCLLSL 

KSLHPMS 

ASPFLRPQGHDSGEREPFSQTPGLMQPFSEPVQ 

ITLQGSRRRQGRTAFPASGKKRETDYSDGDPL 

DVHKRLPSSTGEDRAVMLGFAMMGFSVLMF 

FLLGTTILKPFMLSIQREESTCTAJHTDIMDDW 

LDCAFTCGVHCHGQGKYPCLQVFVNLSHPG 

QKALLHYNEEAVQINPKCFYTPKCHQDRNDL 

LNSALDIKEFFDHKNGTPFSCFYSPASQSEDVI 

LIKKYDQMAIFHCLFWPSLTLLGGAXIVGMV 

RLTQHLSLLCEKYSTVVRDEVGGKVPYIEQH 

QFKLCIMRRSKGRAEKS 


991 


2341 


A 


8369 


9 


921 


S S WEFSALS VSMACLSPSOLOKFOODGFLVI 

EGFLSAEECVAMQQRJGEIVAEMDVPLHCRT 

EFSTQEEEQLRAQGSTDYFLSSGDKIRFFFEK 

GVFDEKGNFLVPPEKSINKIGHALHAHDPVFK 

SITHSFKVQTLARSLGLQMPWVQSMYIFKQP 

HFGGEVSPHQDASFLYTEPLGRVLGVW1AVE 

DATLENGCLWFIPGSHTSGVSRRMVRAPVGS 

APGTSFLGSEPARDNSLFVPTPVQRGALVLIH 

GEVVHKSKQNLSDRSRQAYTFHLMEASGTT 

W SPENWLQPT AELPFPQLYT 


992 


2342 


A 


8370 


906 


4 


MAT ^fiNP^RYYPRFOf^AVPTsJSFPFVVFT >JV 

GGQVYFTRHSTLISPHSLLWKMFSPKRDTAN 

DLAKDSKGRFFIDRDGFLFRYILDYLRDRQVV 

LPDHFPEKGRLKREAEYFQLPDLVKLLTPDEI 

KQSPDEFCHSDFEDASQGSDTRICPPSSLLPAD 

RKWGFITVGYRGSCTLGREGQADAKFRRVPR 

ILVCGRISLAKEVFGETLNESROPDRAPERYTS 

RFYLKFKHLMGAPASNFILGFWGLGQNQDK 

HPVMYLQQRSVIRPDLTSKKAGDLKGKGDA 

QEVSRRRRWLGDPEHL 


993 


2343 


A 


8379 


1 


2794 


MRMQRHKNDTMDFGDSGKRIGGGVLCLLHQ 

SNTSFDCLNNNGFEDIVIVIDPSVPEDEKIIEQIE 

DMVTTA STYLFEATEKRFFFKN VSILIPENWK 

ENPQYKRPKHENHKHADVIVAPPTLPGROEP 

YTKQFTECGEKGEYIHFTPDLLLGKKQNEYG 

PPGKLFVHEWAHLRWGVFDEYNEDQPFYRA 

KSKKIEATRCSAGISGRNRVYKCQGGSCLSRA 

CRIDSTTKLYGKDCQFFPDKVQTEKASIMFM 

QSIDSWEFCNEKTHNQEAPSLQNIKCNFRST 

WEVISNSEDFKNTIPMVTPPPPPVFSLLKIRQRJ 

VCLVLDKSGSMGGKX>RLNRMNQAAKHFLLQ 

TVENGSWVGMVHFDSTATTVNKLIQIKSSDER 

NTLMAGLPTYPLGGTSICSGIKYAFQVIGELH 

SQLDGSEVLLLTDGEDNTASSCIDEVKQSGAI 

VHFIALGRAADEAVIEMSKITGGSHFYVSDEA 

QNNGLIDAFGALTSGNTDLSQKSLQLESKGLT 

LNSNAWMNDTVI1DSTVGKDTFFLITWNSLPP 

SISLWDPSGTIMENFTVDATSKMAYLSIPGTA 

KVGTWAYNLQAKANPETLTITVTSRAANSSV 

PPITVNAKMNKDVNSFPSPMTVYAEILQGYVP 

VLGANVTAFIESONGHTFVLE1 1 DNGAGAD*! 

FKNDGVYSRYFTAYTENGRYSLKVRAHGGA 

MTARLKLRPPLNRAA Y IPGWV VNGEIE ANPP 

RPEIDEDTQTTLEDFSRTASGGAFWSQVPSL 

PLPDQYPPSQITDLDATVHEDKI1LTWTAPGD 

NFDVGKVORYIIR1SASII DLRDSFDDAI OVN 

TTDLSPKEANSKESFAFKPENISEENATHIFIAI 

KSIDKSNLTSKVSNIAQVTLFIPQANPDDIDPT 

PTPTPTPTPDKSHNSGVNISTLVLSVIGSVVIV 

NFiLSTTI 


994 


2344 


A 


8385 


231 


644 


INSSPRTGRDHQELNLHTERDSRSQRAVLKIP 
RQNPGIFYWIFLPSRSHSASHGSRQRQVSCQG 
TQDEIIJKMRNTFAELKNSLEALSSRMDQAEE 
RJGTQAGVQWRDHGSLQPQPPEFKQCFHLSL 
PSSWDYRACLS 


995 


2345 


A 


8390 


194 


3421 


AWRKSSWPPRGTRRGEKSDQDKSGQKNKR 

DFLSMKQSPALAPEERCRRAGSPKPVLRADD 

NNMGNGCSQKLATANLLRFLLLVL1PCICALV 

LLLEILLSYVGTLQKVYFKSNGSEPLVTDGEl 

QGSDVILTNTIYNQSTWSTAHPDQHVPAWT 

TDASLPGDQSHRNTSACMN1THSQCQMLPYH 

ATLTPLLSWRNMEMEKFLKFFTYLHRLSCY 

QHIMLFGCTLAFPECIIDGDDSHGLLPCRSFCE 

AAKEGCESVLGMVNYSWPDFLRCSQFRNQT 

ESSNVSRICFSPQQENGKQLLCGRGENFLCAS 

GICIPGKLQCNGYNDCDDWSDEAHCNCSENL 

FHCHTGKCLNYSLVCDGYDDCGDLSDEQNC 

DCNPTTEHRCGDGRCIAMEWVCDGDHDCVD 

KSDEVNCSCHSQGLVECRNGQCIPSTFQCDG 

DEDCKDGSDEENCSV1QTSCQEGDQRCLYNP 

CLDSCGGSSLCDPNNSLNNCSQCEPITLELCM 

NLPYNSTSYPNYFGHRTQKEASISWESSLFPA 

LVQTNCYKYLMFFSCTILVPKCDVNTGEHIPP 

CRALCEHSKERCESVLG1VGLQWPEDTDCSQ 

FPEENSDNQTCLMPDEYVEECSPSHFKCRSGQ 

CVLASRRCDGQADCDDDSDEENCGCKERDL 

WECPSNKQCLKHTVICDGFPDCPDYMDEKN 

CSFCQDDELECANHACVSRDLWCDGEADCS 

DSSDEWDCVTLSINVNSSSFLMVHRAATEHH 

VCADGWQEILSQLACKQMGLGEPSVTKL1QE 

QEKEPRWLTLHSNWESLNGTTLHELLVNGQS 

CESRSKISLLCTKQDCGRRPAARMN KKILUUK 

TSRPGRWPWQCSLQSEPSGHICGCVLIAKKW 

VLTVAHCFEGRENAAVWKWLGINNLDHPS 

VFMQTRFVKTIILHPRYSRAWDYDISIVELSE 

DISETGYVRPVCLPNPEQWLEPDTYCYITGW 

GHMGNKMPFKLQEGEVRIISLEHCQSYFDMK 

TITTRMICAGYESGTVDSCMGDSGGPLVCEK 

PGGRWTLFGLTSWGSVCFSKVLGPGVYSNVS 

YFVEWIKRQIYIQTFLLN 


996 


2346 


A 


8392 


199 


3085 


KVILSSEMSKTNKSKSGSRSSRSRSASRSRSRS 

FSKSRSRSRSLSRSRKRRLSSRSRSRSYSPAHN 

RERNHPRVYQNRDFRGHNRGYRRPYYFRGR 

NRGFYPWGQYNRGGYGNYRSNWQNYRQAY 

SPRRGRSRSRSPKRRSPSPRSRSHSRNSDKSSS 

DRSRRSSSSRSSSNHSRVESSKRKSAKEKKSSS 

KJDSRPSQAAGDNQGDEVKEQTFSGGTSQDTK 

ASESSKPWPDATYGTGSASRASAVSELSPRBR 

SPALKSPLQSVWRRRSPRPSPVPKPSPPLSST 

SQMGSTLPSGAGYQSGTHQGQFDHGSGSLSP 

SKKSPVGKSPPSTGSTYGSSQKEESAASGGAA 

YTKRYLEEQKTENGKDKEQKQTNTDKEKJKE 

KGSFSDTGLGDGKMKSDSFAPKTDSEKPFRG 

SQSPKRYKLRDDFEKKMADFHKEEMDDQDK 

DKAKGRKESEFDDEPKFMSKVIGANKNQEEE 

KSGKWEGLVYAPPGKEKQRKTEELEEESFPE 

RSKKEDRGKRSEGGHRGFVPEKNFRVTAYK 

AVOFKSSSPPPRKTSESRDKLGAKGDFPTGKS 

SFSITREAQVNVRMDSFDEDLARPSGLLAQER 

KLCRDLVHSNKKEQEFRSIFQMQSAQSQRSP 

SELFAQH1VTIVHHVKEHHFGSSGMTLHERFT 

KYLKRGTEQEAAKNKKSPEIHRRIDISPSTTRK 

HGLAHDEMKSPREPGYKAEGKYKDDPVDLR 

LDIERRKKHKERDLKRGKSRESVDSRDSSHSR 

ERSAEKTEKTHKGSKKQKXHRRARDRSRSSS 

SSSQSSHSYKAEEYTEETEEREESTTGFDKSRL 

GTKDFVGPSERGGGRARGTFQFRARGRGWG 

RGNYSGNNNNNSNNDFQKRNREEEWDPEYT 

PKSKKYYLHDDREGEGSDKWVSRGRGRGAF 

PRGRGRFMFRKSSTSPKWAHDKFSGEEGEIE 

DDESGTFNRFEKDNIOPTTE 


997 


2347 


A 


8398 


202 


552 


CPALGGRQDLQGTRLLWAHDSGVGGQKAKS 
KQENLESLEATGREEEGGQGPPVTTKGVLLA 
LLMAGLALQPGTALLCYSCKAQVSNEDCLQ 
VENCTQLGEQCWTARIREWGDDSRQA 


998 


2348 


A 


8400 


697 


301 


NPPSACTPGSCDSCSGRGRDLAFDSVWSTNN 

MSDPRRPNKVLRYKPPPSECNPALDDPTPDY 

MNLLGMIFSMCGLMLKLKWCAWVAVYCSFI 

SFANSRSSEDTKQMMSSFMLSISAVVMSYLQ 

NPQPMTPPW 


999 


2349 


A 


8401 


93 


1126 


ASASHITSGHLRCFPGSEGVGTMARCFSLVLL 

T TOI Y1/T"TT>T T \/APPT t> A T?T?T OTA\7Pr>t>n JA ITI 

L1MW I IKLLVyOoLKAJ^L^lQVoCRlMOITL 

VSKKANQQLNFTEAKEACRLLGLSLAGKDQ 

VETALKASFETCSYGWVGDGFVV1SRISFNPK 

CGKNGVGVLIWKVPVSRQFAAYCYNSSDTW 

TNSClPEnTTKDPIFNTQTATQTTEFIVSDSTYS 

VASP YSTIPAP 11 1 PPAPASTSIPRRKKLICVTE 

VFMETSTMSTETEPFVENKAAFKNEAAGFGG 

VPTALLVLALLFFGAAAGLGFCYVKRYVKAF 

PFTNKNQQKEMIETKWKEEKANDSNPNEES 


1000 


2350 


A 


8406 


2 


777 


KERCQF WKPMLSTVGSFLQDLQNEDKG IKT 
AAIFTADGNMISASTLMDILLMNDFKLVINKI 
AYDVQCPKREKPSNEHTAEMEHMKSLVHRL 
FTILHLEESQKKRRHHLLEKJDHLKEQLQPLE 
QVKAGIEAHSEAKTSGLLWAGLALLSiQGGA 
LAWLTWWVYSWDIMEPVTYFITFANSMVFF 
AYFIVTRQDYTYSAVKSRQFLQFFHKKSKQQ 
HFDVQQYNKLKEDLAKAKESLKQARHSLCL 
QMQVEELNEKN 


1001 


2351 


A 


8410 


1400 


264 



VGFWERPLRSSRWFRRSLRRWEMLARAARG 
TGALLLRGSLLASGRAPRRASSGLPRNTVVLF 

VPQQEAWVVERMGRFHRILEPGLN1LIPVLDR 

TDv\/r\ci vxn\n\j\/n'cr\Q a \/tt rvkxi/Ti r\Tnr \ / 
IKY VybLKJilVlM VrfcA^bAV ILDNV IL^JDCjV 

LYLRIMDPYKASYGVEDPEYAVTQLAQTTM 

RSELGKLSLDKVFRERESLNASIVDAINQAAD 

CWGIRCLRYEIKDIHVPPRVKESMQMQVEAE 

RRKRATVLESEGTRESAINVAEGKKQAQILAS 

EAEKAEQINQAAGEASAVLAKAKAKAEAIRI 

LAAALTQHNGDAAASLTVAEQYVSAFSKLA 

KDSNT1LLPSNPGDVTSMVAQAMGVYGALT 

KAPVPGTPDSLSSGSSRDVQGTDASLDEELDR 

V ISJVlO 


1002 


2352 


A 


8421 


134 


941 


NRENLLESRMMDPCSVGVQLRTTNECHKTY 
YTRHTGFKTLQELSSNDMLLLQLRTGNfTLSG 

AKKNLH VIDLDDATFLSAKFGRQL VPG WKLC 

PKCTQIINGSVDVDTEDRQKRKPESDGR'JAlK 

ALRSLQFTNPGRQTEFAPETGKREKRRLTKN 

ATAGSDRQVIPAKSKVYDSQGLLIFSGMDLC 

DCLDEDCLGCFYACPACGSTKCGAECRCDRK 

WLYEQIEIEGGEIIHNKHAG 


1003 


2353 


A 


8427 


3 


1416 


TEWGLSG SCPGCSPLEPGSRGRG AAAWRILR 

CRRLPEPSPFLTQPNLAQSQPPAPVPVTDPSVT 

MHPAVFLSLPDLRCSLLLLVTWVFTPVTTE1T 

SLDTENIDEILNNADVALVNFYADWCRFSQM 
LHPIFEEASDVIKEEFPNENQVVFARVDCDQH 
SDIAQRYRISKYPTLKLFRNGMMMKREYRGQ 

T>o\/f at a nvtt) r*Avor\DiriCTDni A C TT*n 
RSVKAJLAJJ Y IK^^KblJr l^fcitvULAill I IIjUKS 

KRNIIGYFEQKDSDNYRVFERVANILHDDCAF 

LSAFGDVSKPERYSGDNIIYKPPGHSAPDMVY 

LGAMTNFDVTYNWIQDKCVPLVREITFENGE 

ELTEEGLPFLILFHMKEDTESLEIFQNEVARQL 

ISEKGTINFLHADCDKFRHPLLHIQKTPADCP 

VIAIDSrRHMYVruUrKUVLlrol^Kyr Vr JJL 

HSGKLHREFHHGPDPTDTAPGEQAQDVASSP 

PESSFQKLAPSEYRYTLLRDRDEL 


1004 


2354 


A 


8432 


910 


387 


GLSRKLRAGFLPGFCRVSPCGSWVVETLVKM 

ACAAARSPADQDRFICIYPAYLNNKKTIAEGR 

RI PI SKA VENPTATbl ^ D V C i> A V O L N V r L bK. N 

KMYSREWNRDVQYRGRVRVQLKQEDGSLC 

LVQFPSRKSVMLYAAEMIPKLKTRTQKTGGA 

DQSLQQGEGSKKGKGKKKK 


1005 


2355 


A 


8453 


90 


530 


QSHETKMQSGTHWRVLOLCLLovu V WuvjU 

GNEEMGGITQTPYKVSISGTTVILTCPQYPGSE 

ILWQHNDKNIGGDEDDKNIGSDEDHLSLKEF 

SELEQSGYYVCYPRGSKPEDANFYLYLRARG 

NPGLQNRYHRLFREDHSKGHSQ 


1006 


2356 


A 


8458 


3 


307 


AVQRJRHEMNIFRLTGDLSHLAAIVILLLKIW 
KTRSCAGISGKSQLLFALVr I lRYLDLr lbrii> 
LYNTSMKVWYAIHRNVFHLQCTGLWTLNLC 
QLCIFN 


1007 


2357 


A 


8459 


43 


553 


GAGAGGDWAAMDKLKKVLSGQDTEDRSGL 

SEVVEASSLSWSIRlKGrlACr AluJLLbLLu 1 

VLLWVPRKGLHLFAVFYTFGNIASIGSTIFLM 

GPVKQLKRMFEPTRLIATIMVLLCFALTLCSA 

FWWHNKGLAL1FCILQSLALTWYSLSFIPFAR 

DAVKKCFAVCLA 


1008 


2358 


A 


8462 


487 


150 


AQDIRSVHSLGQKSTFVKHFRTLSHLHGLPDP 
PPHWPPQERSPPSHPCMPSHRPQIPQLSNSGPS 
DPRWGCVGPSMPTSTCLPGAVEASTTKASLP 
KCPVDSSLPTPEACFL 


1009 


2359 


A 


8465 


134 


954 


b IK VK 1 sLELLkT^LEPTGT V G NTI M TS QP VP 

NETIIVLPSNVINFSQAEKPEPTNQGQDSLKKH 

LHAEIKVIGTIQILCGMMVLSLGIILASASFSPN 

FTQVTSTLLN SAYPFIGPFFFIISGSLSI ATEKRL 

TKLLVHSSLVGS1LSALSALVGF11LSVKQATL 

NPAibLQCbLDKNNlr IKoi VbYr YHUbLr i 1 ID 

CYTAKASLAGTLSLMLICTLLEFCLAVLTAVL 

RWKQAYSDFPGSVLFLPHSYIGNSGMSSKMT 

HDCGYEELLTS 


1010 


2360 


A 


8468 



2 


473 


KYRYRRPYPVMRKJCQVGPAGLAF1LN1SPVA 
HRV ALCHLA U CQbQ AA W Y H 1 LQ1 L r r L V 5> A Y 
FFSCP VPEKYFPGSCDI VGHG H QIFH AFL SICT 
LSQLEAILLDYQGRQEIFLQRHGPLSVHMACL 
S F FF1 , AACS AATAALL RHKVKARLTKKD S 


1011 


2361 


A 


8478 


5 


409 


TELSQLEKAHPPADMGRRKSKRKPPPKKKMT 

GTLETQFTCPFCNHEKSCDVKMDRARNTGVI 

SCTVCLEEFQTPITCILGNLGFFQRVGRGLESG 

PCSSGPLCALVQGQSRPEEQVPPSDFCGVRRC 

RAGFQCQ 


1012 


2362 


A 


8481 


2810 


1652 


RTSTQKWQSVFNDSQEHLERFYCNPENDRM 
RMKY GGQEFW ADLN AMNVYETTEFDQLRR 
LSTPPSSNVNSIYHTVWKFFCRDHFGWREYPE 
SVIRLIEEANSRGLKEVRFMMWNNHYILHNS 

FFRREIKRRPLFRSCFILLPYLQTLGGVPTQAP 

PPLEATSSSQIICPDGVTSANFYPETWVYMHP 

SQDFIQVPVSAEDKSYRHYNLFHKTVPEFKYR 

ILQILRVQNQFLWEKYKRKKEYMNRKMFGR 

DRir^RHLFHGTSQDVVDGICKHNFDPRVCG 

KHATMFGQGSYFAKKASYSHNFSKKSSKGV 

HFMFLAKVLTGRYTMGSHGMRRPPPVNPGS 

VTSDLYDSCVDNFFEPQIFVIFNDD QS YPYFVI 

QYEEVSNTVSI 


1013 


2363 


A 


8488 


2 


517 


IENCRTRLRQAWHEVCGNKMAAPIPQGFSCL 
SRFLG W WFRQPVL VTQS AM VPVRTKKRFTP 
PIYQPKFKTEKEFMQHARKAGLVIPPEKSDRS 
IHLACTAGIFDAYVPPEGDARISSLSKEGLIER 

TERMKKTMASQVSIRRJKDYDANFKJKDFPE 
KAKDIFIEGSPLY 


1014 


2364 


A 


8501 


363 


17 


YIRTGYVYICnYAQLMYTYYIRTAYVYICILY 
AQLMYTYVLYTHSLCIHMYSIRTAYVYICnY 
AQIMYTYVFYTHRLClHMYSIRTDYVYiaLY 
AQLMYTYVFYTHSYMSDE 


1015 


2365 


A 


8504 


3 


2190 


NSSEHFSQAPQRLSFY S WYGSARLFRFRVPPD 

AVLLRWLLQVSRESGAACTDAEITVHFRSGA 

PPVINPLGTSFPDDTAVQPSFQVGVPLSTTPRS 

NASVNVSHPAPGDWFVAAHLPPSSQKIELKG 

LAPTCAYVFQPELLVTRWEISIMEPDVPLPQ 

TLLSHPSYLKVFVPDYTRELLLELRDCVSNGS 

LGCPVRLTVGPVTLPSNFQKVLTCTGAPWPC 

RLLLPSPP WDRWLQ VTAESL VGPL GTVAFSA 

VAALTACRPRSVT1QPLLQSSQNQSFNASSGL 

LSPSPDHQDLGRSGRVDRSPFCLTNYPVTRED 

MDWSVHFQPLDRVSVRVCSDTPSVMRLRL 

NTGMDS GGSLTISLRANKTEMRNETVVVACV 

N AASPFLG FNTSLNCTTAFFQG YPLSL SAWSR 

RANLIIPYPETDNWYLSLQLMCPENAEDCEQ 

AWHVETTLYLVPCLNDCGPYGQCLLLRRHS 

YLYASCSCKAGWRGWSCTDNSTAQTVAQQR 

AATLLLTLSNLMFLAPIAVSVRRFFLVEASVY 

AYTMFFSTFYHACDQPGEAVLCILSYDTLQY 

CDFLG SGAAI WVTILCMARLKTVLKYVLFLL 

GTLVIAMSLQLDRRGMWNMLGPCLFAFVIM 

ASMWAYRCGHRRQCYPTSWQRWAFYLLPG 

VSMASVGIAIYTSMMTSDNYYYTHSIWHILL 

AGSAALLLPPPDQPAEPWACSQKFPCHYQIC 

KNDREELYAVT 


1016 


2366 


A 


8511 


1 


453 


KWYPSGPVRIPGRFYYKLPAGHRRCRMAPAK 

KGGEIvXKGRSAINEVVTREYTINIHKRIHGVG 

FKKRAPRALKEIRKFAMKEMGTPDVRIDTRL 

NKAVWAKGIRNVPYRIRVRLSRKRNEDEDSP 

NKLYTLVTYVPVTTFKNLQTVNVDEN 


1017 


2367 


A 


8513 


54 


1196 


LERTPA SADMA WTK YQLFLAGLML VTGS INT 

LSAKWADNFMAEGCGGSKEHSFQHPFLQAV 

GMFLGEFSCLAAFYLLRCRAAGQSDSSVDPQ 

QPFNPLLFLPPALCDMTGTSLMYVALNMTSA 

SSFQMLRGAVIIFTGLFSVAFLGRRLVLSQWL 

GILATIAGLVWGLADLLSKHDSQHKLSEVn 

GDLLIIMAQIIVAIQMVLEEKFVYKHNVHPLR 

AVGTEGLFGFVILSLLLVPMYYIPAGSFSGNP 

RGTLEDALDAFCQVGQQPLIAVALLGNISSIA 

FFNFAGISVTKELSATTRMVLDSLRTWIWAL 
SLALGWEAFHALQILGFL1LUGTALYNGLHR
PLLGRLSRGRPLAEESEQERLLGGTRTPINDA 
S 


1018 


2368 


A 


8518 


324 


694 


SPFWTEKRRMEKPLFPLVPLHWFGFGYTALV 

VSGGIVGYVKTGSVPSLAAGLLFGSLAGLGA 

YQLYQDPRNVWGFLAATSVTFVGVMGMRS 

YYYGKFMPVGLIAGASLLMAAKVGVRMLM 

TSD 


1019 


2369 


A 


8526 


2 


1787 


VSAAAVNMEPPDAPAQARGAPRLLLLAVLL 

AAHPDAQAEVRLSVPPLVEVMRGKSVILDCT 

PTGTHDHYMLEWFLTDRSGARPRLASAEMQ 

GSELQVTMHDTRGRSPPYQLDSQGRLVLAEA 

QVGDERDYVCWRAGAAGTAEAAARLNVF 

AKPEATEVSPNKGTLSVMEDSAQEIATSNSRN 

GNPAPKITWYRNGQRLEVPVEMNPEGYMTS 

RTVREASGLLSLTSTLYLRLRKDDRDASFHC 

AAHYSLPEGRHGRLDSPTFHLTLHYPTEHVQ 

FWVGSPSTPAGWVREGDTVQLLCRGDGSPSP 

EYTLFRLQDEQEEVLNVNLEGNLTLEGVTRG 

QSGTYGCRVEDYDAADDVQLSKTLELRVAY 

LDPLELSEGKVLSLPLNSRAWNCSVHGLPTP 

ALRWTKDSTPLGDGPMLSLSSITFDSNGTYVC 

EASLPTVPVLSRTQNFTLLVQGSPELKTAEIEP 

KADGSWREGDEVTLICSARGHPDPKLSWSQL 

GGSPAEPIPGRQGWVSSSLTLKVTSALSRDGI 

SCEASNPHGNKRHVFHFGTVSPQTSQAGVAV 

MAVAVSVGLLLLVVAVFYCVRRKGGPCCRQ 

RREKGAP 


1020 


2370 


A 


8530 


2 


1200 


PRVRLLRPSRSRSCRGLLSTRAPGPSPFRSLHS 

SPLLPHAMK SPF YRCQNTTSVEKGNS AVMG G 

VLFSTGLLGNLLALGLLARSGLGWCSRRPLR 

PLPSVFYMLVCGLTVTDLLGKCLLSPVVLAA 

YAQNRSLRVLAPALDNSLCQAFAFFMSFFGL 

SSTLQLLAMALECWLSLGHPFFYRRHITLRLG 

ALVAPWSAFSLAFCALPFMGFGKFVQYCPG 

TWCF1QMVHEEGSLSVLGYSVLYSSLMALLV 

LATVLCNLGAMRNLYAMHRRLQRHPRSCTR 

DCAEPRADGREASPQPLEELDHLLLLALMTV 

LFTMCSLPVIYRAYYGAFKDVKEKNRTSEEA 

EDLRALRFLSVISIVDPWIFIIFRSPVFRIFFHKI 

FIRPLR YR SRC SNSTNMEbaL 


1021 


2371 


A 


8536 


1 


237 


RRGEIDMATEGDVELELETETSGPERPPEKPR 
KHDSGAADLERVTDYAEEKEIQSSNLETAMS 
VIGDRRSREQKAKQER 


1022 


2372 


A 


8537 


94 


541 


RKERRRRRRRMEAVVFVFSLLDCCALIFLSV 

WIITLSDLECDYINARSCCSKLNKWV1PEL1G 

HTIVTVLLLMSLHWF1FLLNLPVATWNIYRYI 

MVPSGNMGVFDPTEIHNRGQLKSHMKEAM1 

KLGFHLLCFFMYLYSMILALIND 


1023 


2373 


A 


8540 


26 


431 


RMMKCPQALLATFWLLLSWVSSEDKVVQSPL 

ci \/vMT?r;r»TVTi "NiP^YPArrNFP^i ! WYICOFK 
V V jiUVjI/ 1 v 1 LlNV^o i cv i i^ir i\.ol<1j w i rvv^-crv 

KAPTFLFMLTSSGIEKJCSGRLSSILDKKELSSIL 

NITATQTGDSAIYLCAVEAQCSLVTCSLYSNS 

TAEALQL 


1024 


2374 


A 


8544 


1731 


743 


GVRLRYSPIAVVMVGEAGRDLRRRRAVAVT 
AEKMAVLAPL1ALVYSVPRLSRWLAQPYYLL 
SALLSAAFLL VRKLPPLCHGLPTQREDGNPCD
FDWREVE1LMFLSAJVMMKNRRSITVEQHIGN 
IFMFSKVANTILFFRLDIRMGLLYITLCIVFLM 
TCKPPLYMGPEYIKYFNDKTIDEELERDKRVT
WIVEFFANWSNDCQSFAPIYADL SLK YNCTG 
LNFGKVDVGRYTDVSTRYKVSTSPT TKOI PT 

L1LFQGGKEAMRRPQIDKKGRAVSWTFSEEN 

VIREFNLNELYQRAKKLSKAGDNIPEEQPVAS 

TPTTVSDGENKKDK 


1025 


2375 


A 


8546 


2194 


1707 


TVSFHKTMASLKCSTVVCVICLEKPKYRCPA

CRVPYCSWCFRKHKEQCNPETRPVEKKIRS 
ALPTKTVKPVFNKDDDn^lADFT N<?r>FPPnP 

VSLQNLKNLGE SATLRSLLLNPHLRQLMVNL 

IXJGEDKAKLMRAYMQEPLFVEFADCCLGIV 
EPSQNEES 


1026 


2376 


A 


8547 


1078 


594 


VfiMFT PAVXJI k'VTT I fiu«/I T TTW^r , i\/Tjcr , c 
v Kjivirjx^r /\ v ]N L,J\. v lJL<JL»vJrl WjLJL 1 1 W Ov_J VroOo 

YAWANFTILALGVWAVAQRDSIDAISMFLGG 

LLATIFLDIVHISIFYPRVSLTDTGRFGVGMAIL 

SLLLKPLSCCFVYHMYRERGGELLVHTGFLG 

SSQDRSAYQTIDSAEAPADPFAVPEGRSQDAR 

GY 


1027 


2377 


A 


8557 


1 


340 


DFLGPASPQEEGGSESSTMTELETAMGMIIDV 
FSRYSGSEGSTQTLTKGELKVLMEKELPGFLQ 
SGKDKDAVDKLLKDLDAN GDAQVDFSEFI VF 
VAAJTSACHKYFEKAGLK 


1028 


2378 


A 


8569 


20 


963 


KMAATLGPLGSWQQWRRCLSARDGSRRLLL 

Till nCnnr i Dn/^\fr' A r!ATPrVI r/npnof cisr* 

jULLLUduyury^j VUA(JQ 1 r h YLFCREHSLSKP 
YQGEAPRPCFLRDWELQVHFKIHGQGKKNL 
HGDGL AI W^YTKDRMQPGPVFGNMDKF V GL G 
VFVDTYPNEEKQQERVFPYISAMYNNGSLSY 
DHERDGRFTEL GGCT AI VRNLHYDTFL VIR Y 
VKRHLTIMMDIDGKHEWRDCIEVPGVRLPRG 
Y YFGTSSITGDL SDNHDVISLKLFELT VERTPE 

ALFLIVFFSLVFSVFAIV1GIILYNKWQEQSRK 
RFY 


1029 


2379 


A 


8572 


1 


578 


AAAASHRSRARSRPRRVSSGPAPRRAQSSAG 
RVASGLDSAPLCTMARALCRLPRRGLWLLLA 
HHLFMTTACQEANYGALLRELCLTQFQVDM 
EAVGETLWCDWGRTIRSYRELADCTWHMAE 
KLGCFWPNAEVDRFFLAVHGRYFRS CPISGR 

AVRDPPGSILYPFIWPITV1XLVTALVVWQS 
KRTEGIV 


1030 


2380 


A 


8574 


1352 


372 


DSSTVKGGSESRHLCLIPDLKGKARTREASSG 

THLTITQALRQPLHRAPLLPGQLCWSPRPLEK 

NKAMGRPLLLPLLLLLQPPAFLQPGGSTGSGP 

S YL YG VTQPKHLSASMGG S VEIPFSF YYP WEL 

AIVPNVRISWRRGHFHGQSFYSTRPPSIHKDY 

VNRLFLNWTEGQESGFLRJSNLRKEDQSVYF 
PRVFT DTRR^fiPfwii n^TVfiTVT tttha \;ttt 

TTWRPSS rn'IAGLRVTESKGHSESWHLSLDT 

AIRVALAVAVLKTVILGLIjCLLLLWWRRRKG 
SRAPSSDF 


1031 


2381 


A 


8580 


905 


340 


RRTAGIYPCFPKPGRTRHALCSWLLLLTGQL

AFDDFQESCAMMWQKYAGSRRSMPLGARIL

FHGVFYAGGFAIVYYLIQKFHSRALYYKLAV

EQLQSHPEAQEALGPPLNIHYLKLIDRENFVDI

VDAKLKIPVSGSKSEGLLYVHSSRGGPFQRW

HLDEV^ELKIXJQQIPVFKLSGENGDEVICJCE


1032 


2382 


A 


8593 


2558 


961 


RRRPRLLPGAEPCEPRVGPRRADMGCSAKAR
WAAGALGVAGLLCAVLGAVMIVMVPSLIKQ 



QVLKNVRIDPSSLSFNMWKEIPIPFYLSVYFFD 

VMNPSEILKGEKPQVRERGPYVYREFRHKSNI 

TFbfNNDTVSFLEYRTFQFQPSKSHGSESDYIV 

MPNILVLGAAVMMENKPMTLKLIMTLAFTTL 

GERAFMNRTVGEIMWGYKDPLVNLINKYFP 

GMFPFKDKFGLFAELNNSDSGLFTGFTGVQNI 

SRIHLVDKWNGLSKVDFWHSDQCNMINGTS 

GQMWPPFMTPESSLEFYSPEACRSMKLMYKE 

SG VFEGI PTYRF V APK TLFAN Go! YPPNEGFCP 

CLESGIQNVSTCRFSAPLFLSHPHFLNADPVL 

AEAVTGLHPNQEAHSLFLDIHPVTGIPMNCSV 

KLQLSLYMKSVAGIGQTGKIEPWLPLLWFA 

ESGAMEGETLH'1'FYTQLVLMPKVMHYAQYV 

LLALGCVLLLVPV1CQIRSQEKCYLFWSSSKK 

p r»T/ TXT/' r A 1 /~\ A "W CPOT APAI/T 

uSKDKhAIQAYbESLM 1 bArKGSVLQEAKL 


1033 


2383 


A 


8595 


595 


767 


AHLPDTLLLPPH SPTVPTPKSFQCSQKACFSRS 
FCLLLSLVSSSLVSLSLCPPLTQA 


1034 


2384 


A 


8597 


640 


164 


VTTSCIIPFAFGLGVRASERLAEIDMPYLLKYQ 

PMMQTIGQKYCMDPAVIAGVLSRKSPGDKIL 

VNMGDRTSMVQDPGSQAPTSWISESQVFQTT 

EVLTTRITELQRRFPTWTPDQYLRGGLCAYSG 

GAGYVRSSQDLSCDFCNDVLARAKYLKRHG 

F 


1035 


2385 


A 


8603 


936 


204 


AMASTLEY SPSPLRRL VGPAAGFSRAARADL 

SWDPMAFFTGLWGPFTCVSRVLSHHCFSTTG 

SLSAIQKMTRVRVVDNSALGNSPYHRAPRCI 

HVYKJQvlGVGKVGDQILLAIKGQKKkALrVG 

HCMPGPRMTPRFDSNNVVLIEDNGNPVGTRI 

KTPIPTSLRKREGEYSKVLAIAQNFV 


1036 


2386 


A 


8606 


1 


562 


PTRAHSFDLCCSPCRRKLLGREEAGEEPTSPV 

TQYLQPRSPEECKMFACAKLACTPSLIRAGSR 

VAYRPISASVLSRPEASRTGEGSTVFNGAQNG 

VSQLIQREFQTSAISRDIDTAAKF1GAGAATVG 

VAGSGAGIGTVFGSLIIGYARNPSLKQQLFSY 

AILGFAI.SEAMGLFCLMVAFLfLFAM 


1037 


2387 


A 


8615 


2 


2364 


SPGPSLPESAESLDGSQEDKPRGSCAEPTFTDT 

GMVAHINNSRLKAKGVGQHDNAQNFGNQSF 

EELRAACLRKGELFEDPLFPAEPSSLGFKDLG 

PNSKNVQNISWQRPKDIINNPLFIMDGISPTDI 

CQGILGDC WLLAAIG SLTTCPKLL YRWPRG 

QSFKKNYAGIFHFQIWQFGQWVNVWDDRL 

PTKNDKXVFVHSTERSEFWSALLEKAYAKLS 

GSYEALSGGSTMEGLEDFTGGVAQSFQLQRP 

PQNLLRLLRKAVERSSLMGCSIEVTSDSELES 

MTDKML VRGHAY S VTGLQDVHYRGKMETLI 

RVRKPWGRIEWNGAWSDSAREWEEVASD1Q 

MQLLHKTEEXjEF WM SYQDFLNNFTLLEICNL 

TPDTLSGDYKSYWH'ITFYEGSWRTGSSAGGC 

RNHPGTF WTNPQFKI SLPEGDDPEDD AEGN V 

WCTCLVALMQKNWRHARQQGAQLQTIGFV 

LYAVPKEFQNIQDVHLKKEFFTKYQDHGFSEI 

FTNSREVSSQLRLPPGEYniPSTFEPHRDADFL 

LRVFTEKHSESWELDEVNYAEQLQEEKVSED 

DMDQDFLHLFKIVAGEGKEIGVYELQRLLNR 

MAIKFKSFKTKGFGLDACRCMINLMDKDGSG 

KLGLLEFKILWKKLKXWMDIFRECDQDHSGT 

LNSYEMRLVIEKAGIKLNNKVMQVLVARYA 

DDDLIIDFDSF1SCFLRLKTMF J FFLTMDPKNT 

GHICLSLEQVLGEGWEGICRIAPACPSTPPPPS 
SDVPGPASCPRLFPPWDLLPVSTVAADDHVGI 
EAL 


1038 


2388 


A 


8621 


3 


1494 


RSRMARAPLGVLLLLGLLGRGVGKNEELRLY 

HHLFNN YDPG SRPVREPEDTVTI S LK VTLTNL 

I SLNEKEETLTTSVWIGIDWQD YRLNYSKDDF 

GGIETLRVPSELVWLPEIVLENNIDGQFGVAY 

DANVLVYEGGSVTWLPPA1YRSVCAVEVTYF 

PFDWQNCSLIFRSQTYNAEEVEFTFAVDNDG 

KTINKIDIDTEAYTENGEWAIDFCPGVIRRHH 

GGATDGPGETDVIYSLURRKPLFYVINIIVPCV 

LISGLVLLAYFLPAQAGGQKCTVSINVLLAQT 

VFLFL1AQK1PETSLSVPLLGRFLIFVMVVATLI 

VMNCVIVLNVSQRTPTTHAM SPRLRHVLLEL 

LPRLLG SPPPPEAPRAASPPRRASSVGLLLRAE 

ELILKKPRSELVFEGQRHRQGTWTAAFCQSL 

GAAAPEVRCCVDAVNFVAESTRDQEATGEE 

VSDWVRMGNALDNICFWAALVLFSVGSSLIF 

LGAYFNRVPDLPYAPCIQP 


1039 


2389 


A 


8636 


1 


900 


PGRERPG GGGARRRPQHLPALLPSERPDCATL 

QAMENELPVPHTSSSACATSSTSGASSSSGCN 

NSSSGGSGRPTGPQISVYSGIPDRQTVQVIQQ 

ALHRQPSTAAQYLQQMYAAQQQHLMLQTA 

ALQQQHLSS AQLQSL AAVQQASL VSNRQG ST 

SGSNVSAQAPAQSSSINLAASPAAAQLLNRA 

QSVNSAAASGIAQQAVLLGNTSSPALTASQA 

QMYLRAQMLIFTPTATVATVQPELGTGSPAR 

PPTPAQVQNLTLRTQQTPAAAASGPTPTQPVL 

PSLALKPTPGGSQPLPTPA 


1040 


2390 


A 


8645 


98 


1388 


ASQLAFGGKLTSTPSRDFQGCGRGAVTCCSF 

HEHRHQSGRCLSTGMAPNLKGRPRKKKPCPQ 

RRDSFSGVKDSNNNSDGKAVAKVKCEARSA 

LTKPKNNHNCKKVSNEEKPKVAIGEECRADE 

QAFLVALYKYMKERKTPIERIPYLGFKQINLW 

TMFQAAQKLGGYETITARRQWKHIYDELGG 

NPGSTSAATCTRRHYERLILPYERFIKGEEDKP 

LPPDCPRJCQENSSQENENKTKVSGTKR1KHEIP 

KSKKEKENAPKPQDAAEVSSEQEKEQETLISQ 

KSIPEPLPAADMKKKIEGYQEFSAKPLASRVD 

PEKDNETDQGSNSEKVAEEAGEKGPTPPLPSA 

PLAPEKDSALVPGASKQPLTSPSALVDSKQES 

KLCCFTESPESEPQEA SFPRLPHHTGHRWQTR 

MRRRMTNCPPWQ1TLPTAP 


1041 


2391 


A 


8646 


113 


1492 


LLQEMCTKTIPVLWGCFLLWNLYVSSSQTIYP 

GIKARITQRALDYG VQA GMKMIEQMLKEKK 

LPDLSGSESLEFLKVDYVNYNFSNIKISAFSFP 

NTSLAFVPGVGIKALTNHGTANISTDWGFESP 

LFVLYNSFAEPMEKPILKNLNEMLCPIIASEVK 

ALNANLSTLEVLTKIDNYTLLDYSLISSPEITE 

NYLDLNLKGVFYPLENLTDPPFSPVPFVLPER 

SNSMLY1GIAEYFFKSASFAHFTAGVFNVTLS 

TEEISNHFVQNSQGLGNVLSRIAEIYILSQPFM 

VRIMATEPPIINLQPGNFTLD1PASIMN1LTQPK 

NSJVET1VSMDFVASTSVGLV1LGQRLVCSLS 

LNRFRLALPESNRSNIEVLRFENILSSILHFGVL 

PLANAKLQQGFPLPNPHKFLFVNSDIEVLEGF 

LUSTDLKYETSSKQQPSFHVWEGLNUSRQW 

RGKSAP 


1042 


2392 


A 


8672 


538 


170 


ARRI ARTRESKAAVS QDNVPALQPGKJQCKLR 
LGGKJCXKFK^RLPKEFKKQLMYSPSNFKKM 
TSLAGNTVQCLNKLKYVIYSAQYPAYGN1T1
LDM1TSTDHVLEQDFW1CFTFYSVKERQI


1043


2393


8688


359


17


GLKTRAPATPTFQREVLGPAKQDMQKRCPR1
GLMTSLLKPIKRRWRDYKRWKSGGFTGESC
HHADTLGDRGGLQGDHSELLQWQKRILRTE

GE PSPKY1SKNIFPICSYITGFL


1044


2394


8718


292


1490


GTVKTSVATPITAGHSCSSGGVLQVKSPATQS
GFKFTSKMEDFNMESDSFEDFWKGEDLSNYS
YSSTLPPFLLDAAPCEPESLEINKYFVV1IYAL
VFLLSLLGNSLVMLV1LYSRVGRSVTDVYLL
NLALADLLFALTLPIWAASKVNGWIFGTFLC
KWSLLKEVNFYSGILLLACISVDRYLAIVHA
TRTLTQKRYLVKF1CLSIWGLSLLLALPVLLFR
RTVY SSN V SPACYEDMGNNTAN WRMLLRJL
PQSFGFIVPLLIMLFCYGFTLRTLFKAHMGQK
HRAMRVIFAVVLIFLLCWLPYNLVLLADTLM
RTQVIQETCERRNHIDRALDATE1LGILHSCLN
PL1YAFIGQKFRHGLLKILAIHGLISKDSLPKDS

RPS FVGSSSGHTSTTL


1045


2395


8724


254


3184


FRANLA1TVANRRGAQGGKMHTCCPPVTLEQ
DLHRKMHSWMLQTLAFAVTSLVLSCAETIDY
YGEICDNACPCEEKDGILTVSCENRG1ISLSE1S
PPRFPI YHLLLSGNLLNRLYPNEFVN YTG A SIL
HLG SN VIQD1ETGAFHGLRGLRRLHLNNNKL
ELLRDDTFLGLENLEYLQVDYMYISVIEPNAF
GKLHLLQVLILNDNLLSSLPNNLFRFVPLTHL
DLRGNRLKLLPYVGLLQHMDKWELQLEEN
PWNCSCEL1SLKDWLDS1SYSALVGDVVCETP
FRLHGRDLDEVSKQELCPRRLISDYEMRPQTP
LSTTGYLHTTPASVNSVATSSSAVYKPPLKPP
KGTRQPNKPRVRPTSRQPSKDLGYSNYGPS1A
YQTKSPVPLECPTACSCNLQ1SDLGLNVNCQE
RKTESIAELQPKPYNPKKMYLTENYIAWRRT
DLLEATGLDLLHLGNNRISM1QDRAFGDLTN
LRRLYLNGNRIERLSPELFYGLQSLQYLFLQY
NLIREIQSGTFDPVPNLQLLFLNNNLLQAMPS
GVFSGLTLLRLNLRSNHFTSLPVSGVLDQLKS
LIQIDLHDNPWDCTCDIVGMKLWVEQLKVG
VLVDEVICKAPKKFAETDMRSIKSELLCPDYS
DVVVSTPTPSSIQVP ARTSAVTPAVRLN STGA
PASLGAGGGASSVPLSVLILSLLLVFIMSVFVA
AGLFVLVMKRRKKNQSDHTSTNNSDVSSFN
MQYSVYGGGGGTGGHPHAHVHHRGPALPK
VKTPAGHVYEY1PHPLGHMCKNPIYRSREGN
SVEDYKDLHELKVTYSSNHHLQQQQQPPPPP
QQPQQQPPPQLQLQPGEEERRESHHLRSPAYS
V STIEPREDLLSPVQD ADRF YRG ILEPDKHCST
TPAGNSLPEYPKFPCSPAAYTFSPNYDLRRPH
QYLHPGAGDSRLREPVLYSPPSAVFVEPNRNE
YLELKAKLNVEPDYLEVLEKQTTFSQF


1046


2396


8736


28


452


SPSAAGGLAWVSLALGSGSRGRDHSGSGVGT
AMAG ALVRKAADY VRSKDFRD YLMSTHF W
GPVANWGLP1AAINDMKKSPEIISGRMTFALC
CYSLTFMRFAYKVQPRNWLLFACHATNEVA

QLIQGGRLIKHEMTKTASA


1047


2397


8741


673


924


ALPGTPQQTVTLNTDGK VKSFTSPH SNPNLPP
AKFFTSLQSLNWS SHLPPSPATES VGKRGNAK

PPTTKLLHSSPLWNFFAQQL


1048


2398


8747


5054


PEVTKPSLSQPTAASPIGSSPSPPVNGGNNAKR
VAVPNGQPPSAARYMPREVPPRFRCQQDHK 

VLLKRGQPPPPSCMLLGGGAGPPPCTAPGAN 

PNNAQVTGALLQSESGTAPDSTLGGAAASNY 

ANSTWGSGASSNNGTSPNPIHIWDKVIVDGS 

DMEEWPCIASKDTESSSENTTDNNSASNPGSE 

KSTLPGSTTSNKGKGSQCQSASSGNECNLGV 

WKSDPKAKSVQSSNSTTENNNGLGNWKNVS 

GQDRIGPGSGFSNFNPNSNPSAWPALVQEGTS 

RKGALETDNSNSSAQVSTVGQTSREQQSKME 

NAGVNFWSGREQAQIHNTDGPKNGNTNSL 

NLSSPNPMENKGMPFGMGLGNTSRSTDAPSQ 

STGDRKTGS VG SWG AARGPSGTDTVSGQSNS 

GNNGNNGKEREDSWKGASVQKSTGSKKDS 

WDNNNRSTGGSWNFGPQDSNDNKWGEGNK 

MTSGVSQGEWKQPTGSDELK1GEWSGPNQPN 

SSTGAWDNQKGHPLLENQGNAQAPCWGRSS 

SSTGSEVEGQSTG SNHKAGS SDSHN SGRRSY 

RPTHPDCQAVLQTLLSRTDLDPRVLSNTGWG 

QTQ1KQDTVWDIEEVPRPEGKSDKGTEGWES 

AATQTKNSGGWGDAPSQSNQMKSGWGELS 

ASTEWKDPKNTGGWNDYKNNNSSNWGGGR 

PDEKTPSSWNENPSKDQGWGGGRQPNQGWS 

SGKNGWGEEVDQTKNSNWESSASKPVSGWG 

EGGQNE1GTWGNGGNASLASKGG WEDCKRS

PAWNETGRQPNSWNKQHQQQQPPQQPPPPQ 

PEASGSWGGPPPPPPGNVRPSNSSWSSGPQPA 

TPKDEEPSG WEEPS PQSISRK.MDIDDGTSAWG 

DPNSYNYKNVNLWDKNSQGGPAPREPNLPTP 

MTSKSASDSKSMQDGWGESDGPVTGARHPS 

WEEEEDGGVWNTTGSQGSASSHNSASWGQG 

GKKQMKCSLKGGNNDSWMNPLAKQFSNMG 

LLSQTEDNPSSKMDLSVGSLSDKKFDVDKRA 

MNLGDFNDIMRKDRSGFRPPNSKDMGTTDS 

GPYFEKGGSHGLFGNSTAQSRGLHTPVQPLN 

SSPSLRAQVPPQF1 SPQ VSASMLKQFPNSGLSP 

GLFNVGPQLSPQQ1AMLSQLPQIPQFQLACQL 

LLQQQQQQQLLQNQRKISQAVRQQQEQQLA 

RMVSALQQQQQQQQRQPGMKHSPSHPVGPK 

PHLDNMVPNALNVGLPDLQTKGPIPGYGSGF 

SSGGMDYGMVGGKEAGTESRFKQWTSMME 

GLPSVATQEANMHKNGAIVAPGKTRGGSPY 

NQFDI1PGDTLGGHTGPAGDSWLPAKSPPTNK 

IGSKSSNASWPPEFQPGVPWKG1QNIDPESDP 

YVTPGSVLGGTATSP1VDTDHQLLRDNTTGS 

MQQT "MTQI P<!PHA WPV<3ACnMCVTKr\/HCTC AV 
rijOL,P* I OLrjiUnWr I onoUVHor 1 IN Vxlol oAJtv 

FPDYKSTWSPDP1GHNPTHLSNKMWKNHISS 

RNTTPLPRPPPGLTNPKPSSPWSSTAPRSVRG 

W GTQDSRLAS ASTWSDGGS VRPS YWL VLHN 

LTPQIDGSTLRTICMQHGPLLTFHLNLTQGTA 

L1RYSTKQEAAKAQTALHMCVLGNTTILAEF 

ATDDEVSRFLAOAOPPTPAATPSAPAAGWOS 

LETGQNQSDPVGPALNLFGGSTGLGQWSSSA 

GGSSGADLAGASLWGPPNYSSSLWGVPTVED 

PHRMGSPAPLLPGDLLGGGSDSI 


1049 


2399 


A 


8748


200 


1387 


VPWKRQDEQL SLQ VETL YLDSPA VIHLLSPTF 

LPPSSLPPFLQ1VDSSSSACTLDSFFPFLAPWDS 

PQDCGFKDHQPLTLQALTVELARWTL MLLLS 

TAMYGAHAPLLALCHVDGRVPFRPSSAVLLT 

ELTKLLLCAFSLLVGWQAWPQGPPPWRQAA 

PFALSALLYGANNNLV1YLQRYMDPSTYQVL 



SNLKIGSTAVLYCLCLRHRLSVRQGLALLLL 

MAAGACYAAGGLQVPGNTLPSPPPAAAASP 

MPLHITPLGLLLLILYCLISGLSSVYTELLMKR 

QRLPLALQNLFLYTFGVLLNLGLHAGGGSGP 

GLLEGFSGWAALWLSQALNGLLMSAVMKH 

GSSITRLFWSCSLVVNAVLSAVLLRLQLTAA 

FFLATLLIGLAMRLYYGSR 


1050 


2400 


A 


8758 


3 


1660 


WVSSMGFEELLEQVGGFGPFQLRNVALLALP 

RVLLPLHFLLPIFLAAVPAHRCALPGAPANFS 

HQD VWLEAHLPREPDGTLSS CLRF A YPQ ALP 

NTTLGEERQSRGELEDEPATVPCSQGWEYDH 

SEFSSTIATESQWDLVCEQKGLNRAASTFFFA 

GVLVGAVAFGYLSDRFGRRRLLLVAYVSTLV 

LGLASAASVSYVMFAITRTLTGSALAGFT1IV 

MPLELEWLDVEHRTVAGVLSSTFWTGGVML 

LALVGYLIRDWRWLLLAVTLPCAPGILSLWW 

VPE S ARWLLTQ GH VKE AriR Y LLHC ARLN UK 

PVCEDSFSQEAVSKVAAGERVVRRPSYLDLF 

RTPRLRHISLCCWVWFGVNFSYYGLSLDVS 

GLGLNVYQTQLLFGAVELPSKLLVYLSVRYA 

GRRLTQAGTLL GTAL AFGTRLLVSSDMKSWS 

TVLAVMGKAFSEAAFTTAYLFTSELYPTVLR 

QTGMGLTALVGRLGGSLAPLAAJLLDGVWLS 

LPKLT YG GIALLAAGTALLLPETRQ AQLPETI 

QDVERKSAPTSLQEEEMPMKQVQN 


1051 


2401 


A 


8759 


515 


1625 


EIRTPVAVSSAPSGDSEGDEEETTQDEVSSHTS 

EEDGGVVKVEKELENTEQPVGGNEVVEHEV 

TGNLNSDPLLELCQCPLCQLDCGSREQLIAHV 

YQHTAAVVSAKSYMCrVLuKAL&bFOoLUK 

HLL1HSEDQRSNCAVCGARFTSHATFNSEKLP 

EVLNMESLPTVHNEGPSSAEGKDIAFSPPVYP 

AGILLVCNNCAAYRKLLEAQTPSVRKWALRR 

QNEPLEVRLQRLERERTAKKSRRDNETPEERE 

VRRMRDREAKRLQRMQETDEQRARRLQRDR 

EAMRLKRANETPEKRQARLIREREAKRLKRR 

LEKMDMMLRAQFG QDPSAMAAL AAEMNFF 

QLPV SGVELDSQLLGKMAFEEQNS SSLH 


1052 


2402 


A 


8763 


1106 


70 


RHGHGGRDRRGGGRVARPGGLGRYPGRGAA 

ASLVFVPTRRRSGPSGTASVAAMAYHSGYGA 

HGSKHRARAAPDPPPLFDDTSGGYSSQPGGY 

PATG ADV AFS VNHLLGUrMArJ VAMA Y U b bl 

ASHGFCDMVHKELHRFVSVSKLKYFFAVDTA 

YVAKKLGLLVFPYTHQNWEVQYSRDAPLPP 

RQDLNAPDLYIFTMAFITYVLLAGMALG1QK 

RFSPEVLGLCASTALVWWMEVLALLLGLYL 

ATVRSDLSTFHLLAYSGYKYVGMILSVLTGL 

LFGSDGYYVALAWTSSALMYFIVRSLRTAAL 

GPDSMGGPVPRQRLQLYLTLGAAAPQPUIY 

WLTFHLVR 


1053 


2403 


A 


8768 


2 


712 


RPPRVWYPELRELSAAAPRWSHRTAPGIMVF 
YFTSSSVNS SA YTI YMGKDK Y ENEDLIKH G W 
PEDIWFHVDKLSSAHVYLRLHKGENIEDIPKE 
VLMDCAHLVKANSIQGCKI4NNVNVVYTPW 
SNLKJCTADM D VGQIGFHRQKDVKI VTVEKK 
VNEILNRLEKTKVERFPDLAAEKECRDREER 
NEKKAQIQEMKKREKEEMKKKREMDELRSY 
SSLMKVENMSSNQDGNDSDEFM 


1054 


2404 


A 


8769 


344 


527 


REATTLACRNSCWVFSRCSLGACKPTVCSMP 
SLSRQGSQTLCLRLAHYCMESVDSQRLLLS 


1055 


2405 


A 


8770 


430 


1104 


QQESPAAGAARMNCKEGTDSSCGCRGNDEK 

KMLKCVWGDGAVGKTCLLMSYANDAFPEE 

YVPTVFDHYAVTVTVGGKQHLLGLYDTAGQ 

EDYNQLRPLSYPNTDVFLICFSWNPASYHNV 

QEEWVPELKDCMPHVPYVLIGTQIDLRDDPK 

TLARLLYMKEKPLTYEHG VKX AKA TG A OPYT 

ECSALTQKGLKAVFDEAILTIFHPKKKKKRCS 
EGHSCCSII 


1056 


2406 


A 


8773 


261 


332 


NPRIQLSGNSCCAGSCRVWLSEQ 


1057 


2407 


A 


8778 


3 


477 


PAGIRHEQARGADRMGKCRGLRTARKLRSH 

RRDQKWHDKQYKKAHLGTALKANPFGGAS 

HAKGIVLEKVGVEAKQPNSAIRKCVRVQLIK 

NGKKITAFVPNDGCLNFIEENDEVLVAGFGR 
KGHAVfiDIPrJVPFK"VVk r \/A\lVQT i at wri r 

-iwji J_n. V \JLJ11 VJ V JvTXV V V JS. V rVIN V oi^JL/VL I JrvvJlV 

KERPRS 


1058 


2408 


A 


8808 


171 


881 


PGLSQEPSGSMETVV1 VAJGVLATIFLASFAAL 
VLVCRQRYCRPRDLLQRYDSKPIVDLIGAME 
TQSEPSELELDDVVITNPHIEAILENEDWIEDA 
SGLMSHCIAILKICHTLTEKLVAMTMGSGAK 
ivijvi v ouilv VAJtvKJorKVL/L>V VKoMYrPL 
r>Pl<*T T FiAPTTAT T T QVQI4T \n \rro xr a rui tv"> 

GLDWIDQSLSAAEEHLEVLREAALASEPDKG 
LPGPEGFLQEQSAI 


1059 


2409 


A 


8809 


246 


757 


MRLQGAIFVLLPHLGPILVWLFTRDHMSGWC 

EGPRMLS WCPFYK VLLLVQTA1 YS V VG YA SY 

LVWKDLGGGLGWPLALPLGLYAVQLTISWT 

VLVLFFTVHNPGLALLHLLLLYGLWSTALI 

WHPINKXAALLLLPYLAWLTVTSALTYHLWR 

DSLCPVHQPQPTEKSD 


1060 


2410 


A 


8810 


304 


381 




1061 


2411 


A 


8820 


1673 


848 


SCKTENLLEMWWFQQGLSFLPSALVIWTSAA

FIFSYITAVTLHHIDPALPYISDTGTVAPEKCLF 

GAMLNIAAVLCIATIYVRYKQVHALSPEENVI 

IKLNKAGLVLGILSCLGLSIVANFQKTTLFAA 

HVSGAVLTFGMGSLYMFVQTILSYQMQPKIH 

GKQVFWIRLLLVIWCGVSALSMLTCSSVLHS 

GNFGTDLEQKLHWNPEDKGYVLHMITTAAE 

WSMSFSFFGFFLTYIRDFQKI SLRVEANLHGL 

TT YDTAPCPTMNF.RTRT I SRHT 


1062 


2412 


A 


8824 


1 


763 


GGAPPAS VPARESPVSGAQGS SRTRGHKRAA

GARAPQLCSSWQRRSAPAMSRGLQLLLLSCA 

YSLAPATPEVKVACSEDVDLPCTAPWDPQVP 

YTVSWVKLLEGGEERMETPQEDHLRGQHYH 

QKGQNGSFDAPNERPYSLKIRNTTSCNSGTYR 
CTLODPDGORNLSfrKVTT TtVTOCPAn&WVT 

ITGCYRAEIVLLLALVIFYLTLIIFTCKFARLQSI 

FPDFSKAGMERAFLPVTSPNKHLGLVTPHKT 
ELV 


1063 


2413 


A


8826 


147 


627 


CETSTSSAGHAPCRHAAQGPPAEPTGLRLCSE 

HQRLHAWPPGPRRPSLWPPKNGKWHSGKRT 

AGGRPQRRPSRRQSQRPSAWSGSPRMHSPGQ 

KCSLMCPHRSQDSLSTAIFQRSPGANTGRALH 

CVXSKEMKSVQRSLGLSRIHLQSKRKnHFVL 
TR 


1064 



2414 


A 


8835 


2982 


1869 


LKDTL K SQMTQE A SDE AEDMKE AMNRMIDE 
LNKQVSELSQLYKEAQAE1 .FDYRKRKSLEDV 
TAEYIHKAEIiEKLMQLTNVSRAKAEDALSE 
MKSQYSKVXNELTQLKQLVDAQKENSVSITE 
HLQVITTLRTAAKEMEEKJSNLKEHLASKEVE 
VAKLEKQLLEE1CAAMTDAMVPRSSYEKLQS 

SLESEVSVLASKLKESVKEKEKVHSEVVQIRS 

EVSQ VKREKENIQTLLK SKEQEVNELLQKFQ 

QAQEELAEMKRYSESSSKLEEDKDKKINEMS 

KEVTKLKEALNSLSQLSYSTSSSKRQSQQLEA 

LQQQVKQLQNQLAECKKQHQEVISVYRMHL 

K 


1065 


2415 


A 


8841 


3 


663 


AAATAASLSPRGCRLRTPSSDVGPSRAPPPSA 

APLPTGRAQMSPSGRLCLLTIVGLILPTRGQTL 

KDTTSSSSADATIMDIQVPTRAPDAVYTELQP 

TSPTPTWPADETPQPQTQTQQLEGTDGPLVT 

DPETHKSTKj\AHPTDDTTTLSERPSPSTDVQT 

DPQTLKPSGFHEDDPFFYDEHTLRKRGLLVA 

AVLFITGIIILTSGKCRQLSRLCRNHCR 


1066 


2416 


A 


8853 


3806 


2204 


FVGEQEGGCEAGAGRGAQTYPGEAGERWFG 

RRRRRGRWSRKKMSLKSERRGIHVDQSDLL 

CKKGCGYYGNPAWQGFCSKCWREEYHKAR 

QKQIQEDWELAERLQREEEEAFASSQSSQGA 

QSLTFSKFEEKKTNEKTRKVTTVKKFFSASSR 

VGSKKEIQEAKAPSPSINRQTSIETDRVSKEFIE 

FLKTFHKTGQEIYKQTKLFLEGMHYKRDLSIE 

EQSECAQDFYHNVAERMQTRGKVPPERVEKI 

IVLLA^IEKYlMrRLYKYVrLrbl TDDLKKULAI 

QKRIRALRWVTPQMLCVPVNEDIPEVSDMW 

V ITTf WPPA^ AnnPT PTT 1VTVI TtffTWPPPl A^XII 

OYTTRFnsTP^PT K/TT fiFnfiYYFTMT PPAVAFTF 
\^ ill ivjt v^iNi oix-L/ivi i yjCiijyj i i r i in i_^v_.\^/ y v r\r ijc> 

KLDAQSLNLSQEDFDRYMSGQTSPRKQEAES 
WSPDACLGVKQMYKNLDLLSQLNERQERIM 
NEAKKLEKDLIDWTDGIAREVQDIVEKYPLEI 
KPPNQPLAAIDSENVENDKLPPPLQPQVYAG 


1067 


2417 


A 


8855 


1372 


1513 


SNMREVGCGWLVPV1PAFWEAEVGGSLEARS 
LRQAWATKQDPISKKK 


1068


2418 


A 


8856 


1530 


1583 


PCRPGMECNSMISVHCNL 


1069 


2419 


A 


8857 


1530 


1583 


PCRPGMECNSMISVHCNL 


1070 


2420 


A 


8866 


293 


1675 


PYPQGGYPQGPYPQEGYPQGPYPQGGYPQGP 

YPQSPFPPNPYGQPQVFPGQDPDSPQHGNYQ 

EEGPPSYYDNQDFPATNWDDKSIRQAFIRKVF 

LVLTLQLSVTLSTVSVFTFVAEVKGFVRENV 

WTYYVSYAVFFISLIVLSCCGDFRRKHPWNL 

VALSVLTASLSYMVGMIASFYNTEAVIN4AVG 

11 lAVCr 1 V VlroM^ 1 KYUr 1 otMu VIA, V5>M 

VVLFIFAILCIFIRNRILE1VYASLGALLFTCFLA 
VDTQLLLGNKQLSLSPEEYVFAALNLYTDIINI 
FLYILTIIGRAKE*PSSSSLCPLRWHGWPGPCP 

TADTSIWTRCGHSMAPLVLPPPPRGTKATFPC 
HLLSTHCCMSPVCQPTPGTGGSTRSRGEGLSQ 
EVRVHVFPPVPAPQPGVEHPSPPPHPPGVLPS 
GHMR^fifil TPVl 9PF 


1071 


2421 


A 


8868 


2 


358 


ARGNTLYHLPRLCRKJLNLRWFSASTLYDVQH 
DDKMGSNTFFKRNDCRYVMISCKADMAYDN 
VRHPFM1* SIXKLIMEETYLNIIKA VYDRPTASII 
LNGEKLKVFPVRSGT*QGCSVWP 


1072 


2422 


A 


8870 


33 


658 


MESVLSKYEDQIT1FTDYLEEYPDTDELVWIL 
GKQHLLKTEKSKLLSDISARLWFTYRRKFSPI 
GGTGPSSDAGWGCMLRCGQMMLAQALICRH 
LGRDWSWEKQKEQPKEYQRILQCFLDRKDC 



CYSIHQMAQMGVGEGKSIGEWVLGPNTWAQ 

n\/*lfMT A\I FrWllA"h.TOT CI \/V\/OX/\rVKTT)0/™ , OT A 
Vj V^KiN L/AvLr IJo WuNaLVjLV Y V oM \D IN roO MA 

RFPKKLCRVLPUSADTAGLTGP 


1073 


2423 


A 


8879 


146 


412 


DFS V* GD VDIEVTCPICLQLLTEPLSLN CGLRL 

*QVCITA*IKESVnSGG*SSSPVCHTTFQPANL 

RTSRYLPT*SIKSLGPDEPQEG 


1074 


2424 


A 


8884 


67 


435 


HLQGRSIRTLQLTGENEKNCEVSERIRRSGPW 
KEISFGDYICHTFQGDCWADRSPLHEAAAHG 
RLLALKTLIAQGVNVNLWTL/DRVSSLHEACL 
* GP V ACAKPY WKM VPRHGGT VTGPPLLM V 


1075 


2425 


A 


8896 


1294 


248 


RSGDRNGLTHQLGGLSQGSRNQSYRSRSRSR 

SRERPSAPRG1PFASASSSVYYGSYSRPYGSDK 

PWPSLLDKJEREESLRQKRLSERERIGELGAPE 

VWGLSPKNPEPDSDEHTPN^EDEEPKKSTTSAS 

TSEEEKXKJCSSRSKERSKKRRKKXSSKRKHK 

KYSEDSDSDSDSETDSSDEDNKRRAKKAKKK 

EKJCKKHRSKKYKKKRSKKSRXESSDSSSKES 

QEEFLENPWKDRTKAEEPSDLIGPEAPKTLTS 

QDDKPLNYGHALLPGEGAAMAEYVKAGKRI 

PRRGEIGLTR*RNCHHLNAQVM**WSRHRR 

MEAVRTAKREPESTVLMRREPLHPFNPRRET 

KERE 


1076 


2426 


A 


8899 


146 


789 


GRSTEAEKEPAFDERTGKGRRLPRAGEFHG*E 

*APGPGPRSFQVSRKMPEE\PPGARKHPFSGKS 

FYLDLPAGKNLQFLTGAIQQLGGVIEGFLSKE 

VSYIVSSRREVKAESSGKSHRGCPSPSPSEVR 

VETSAMVDPKGSHPRPSRKPVDSVPLSRGKE 

LLQKA1KNQK* * CTVQQLSHCRLYXGEKTT AK 

RSQREHVQQQSQEHGKWPDLKGPR 


1077 


2427 


A 


8901 


352 


3 


AKIGA YKYIQELWRKKQ SDVMHFLLRVRCW 
Q YP ALHRAGTE W QL S ALHRAPRSTQPDKAC 
RLGYKAKQGYIIYRICVRRGGWKCPVPKAVT 
\ YGKPVHHG VN* LKFAQSLQS VAEEQ 


1078 


2428 


A 


8905 


536 


781 


ACPAENREVPEMAAG QAPHAGPGAGPGQPA 
PALPFAATPGSRGQALCRGGRRRQHLHGPLH 
RP*QAAPALHAGCQLAPHPPT 


1079 


2429 


A 


8912 


121 


376


NLIWKLCVTERRLVILDNYDLASE/YEANKYI 
CNRIIQFKPGQDKYFTLGLPTGSTPL*CYPKLI 
EYNKNGHLSFKYVKTFSMDEY 


1080 


2430 


A 


8920 


381 


1788 


SSESPSDPGRMAMTWIVFSLWPLTVFMGHIG 

GHSLFSCEPITLRMCQDLPYNri'FMPNLLNHY 

DQQTAALAMEPFHPMVNLDCSRDFRPFLCAL 

YAPICMEYGRVTLPCRRLCQRAYSECSKJLME 

MFGVPWPEDMECSRFPDCDEPYPRLVDLNLA 

GEPTEGAPVAVQRDYGFWCPRELKIDPDLGY 

SFLHVRDCSPPCFhMYFRREELSFARYFIGLIS 

HCLSATLFTFVTFLIDVTRFRYPERPIKCYAV 

WFTMMVSLIFF\IGFLLEDRVACNA\SIPAQYKA 

STVTQGSHNKACTMLFMILYFFTMAGSVWW 

VILTITWFLAAVPKWGSEAIEKKALLFFIASA 
wnipon ttit i AMxnfiFnnMT^nvpFvni vn 

" vj 11 UlLfl Ul-ilj/ViVJLlN Jv11jVJL/1t| lOVJ V \_^r V VJL/ I l_y 

VDALRYFVLAPLCLYVWGVSLLLAGIISLNR 
VRIEIPL* KENQDKLVKFMIRIG VFSILYL VPLL 
VVIGCYFYEQAYRGIWEITWIQERC 


1081 


2431 


A 


8922 


56 


420 


EERTKMSTGPDVKATVGDISSDGNLNVAQEE 
CSRKGIVDEFFPLLSN*CIWTQPQGYPQSSYG 
TLANFVFVCSVRHGLALILQLCNFSIYTQQMN 
LS1AJPAMVNNTAPPSQPNASTERPST 


1082 


2432 


A 


8923 


355 


1079 


PFGTPSSTMAVVKNKCLMKGGKKGVKKKVV 
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seq- 
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SEQ ID 
NO: of 
peptide 
seq- 
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hod 


SEQ 
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in 
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914 
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beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
peptide 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to last amino 
acid residue 
of peptide 
sequence 


Amino acid sequence (A=Alanine, C=Cysteine, 
D=Aspartic Acid, E=Glutamic Acid, 
F=Phenylalanine, G=Glycine, H=Histidine, 
I=Isoleucine, K=Lysine, L=Leucine, 
M=Methionine, N=Asparagine, P=Proline, 
Q=Glutamine, R=Arginine, S=Serine, 
T=Threonine, V=Valine, W=Tryptophan, 
Y=Tyrosine, X=Unknown, *=Stop codon, 
/=possible nucleotide deletion, \=possible 
nucleotide insertion 














GPFSKKDQYDVKAPAMFNIRNTGKTLVART 

QGTQIASDGLKGLLFEVSLADLQNDEVAFRK 

FKLITEDVQDKNCLTNFYGMDLTCDKICSMV 

EKWSTMIEAHVDVKTTDGYFFHLFCVGFTKK 

HNNQILKTSYA*HQQS/RQIQKKMMEIMT* EV 

QTNDLKEVVNKLIPDNIGKDTEKV/CPIYPLH 

DVFIRK VKMLENPGFERNMELRG GGS SS 


1083 


2433 


A 


8948 


28 


385 


LTWPQPHIPSCPAMSEETLQSKLAAAKKKLP 
WG A VQGSRAM SDLLLLLLDLTLLLLLMLLGF 
AGYSGQLAGVAVSAGSPPI/RYKFHVEPYGET 
GWLLT/ESCSISPKLCSIAVH*DNPAWF 


1084 


2434 


A 


8950 


156 


318 


HYTPINTDTIENSENNKCW*GY*E\VGLIHHW 
WGGKRVQPFWKRVWQKRTLNLRV 


1085 


2435 


A 


8956 


16 


413 


HMGQLGYFIQC W WECKRLISFVWKTI* QSPAK 

♦TIYTSYDTAIPIS/Gt^YPKRMSSKCHQETCAR 

MFILAPFTATIKGKQLTCPLVEERIDYXMWYS 

HKYYIKVKRNL*VTITH\TWVNLNILMFEIILW 

YSHKYY 


1086 


2436 


A 


8962 


868 


1026 


H* KILQVGRAQRAHXSRL* SQLLRRLRHESHL 
NPGARGCSEARLHRCTPAWTT 


1087 


2437 


A 


8985 


58 


330 


LHVKHLGHFQLVFSEVICHCILMPVS*ELQRL 
* ERS VCAFH VCIQTYVCLQVYACMCVY YICM 
FVYSVYGCGLCTCVCMDVY1CVCVQEFL 


1088 


2438 


A 


8989 


394 


404 


N* KWILH VNVRJQSIFF/IKRNQK/INSHELKLD 
KKFLDMMSNA*STKKHDKLD/LIKFKT/LCSA 
K YTV fCRI KIH PTDL EKMLRNIIL S DKD * Y S/G V 
YKDLSKLNRRKTE/S*/VKKWVKDLSRYFIKE 
VISMENKHKKIFSTS 


1089 


2439 


A 


8991 


60 


329 


MALTPESPSSFPGLAATGSSVPEPPGGPNATL 
NSSWDSPTEPSSLEDLEATGTIGTLLSDMGW 
G VEDN A YTLE VNSR YMRA VGIM* IHL 


1090 


2440 


A 


8996 


2 


351 


SNITITLT*MKKYDNTFCW*GCGQIG/T/LIYC 
W QESKFIQ AFW SKIQQ YLA* I SIHILFDP AFLFL 
GGYPGGTQSVFLTGVLVSSVFYNMKMLHTR 
LLIAALFIIVQYWKQSKDHYI 


1091 


2441 


A 


8997 


97 


456 


YPLPVCSYLSGPRGEHWNSLGGKSSCPLPLPT 
LVSSRFKISKVIWGDLSVGKTCLINR*GGAG 
AELGRVGPSLARWAGSRSQHLVPSQWCKDS 
FDKNYKAPIGADFEMERFEVLGIPF 


1092 


2442 


A 


8999 


548 


811 


SSFIKRHILIFEDD WHQTTCCHHPHHPXF* RCQ 

FHIFYVSVQNSISPSLSVSSSHPDRPDHEVHQH 

RAAIIHHQHGQGPLGHGLVARVG 


1093 


2443 


A 


9002 


3 


2745 


ALLGLQQPAQSLILSRSSVMGVRGLQGFVGS 

TCPHICTVVNFICELAEHHRSKYPGCTPTIWD 

AMCCLRYWYTPESWICGGQWREYFSALRDF 

VKTFTAAGIKLIFFFDGMVEQDKRDEWVKRR 

LKNNREISR1FHYIKSHKEQPGRNMFFIPSGLA 

VFTRFALKTLGQETLCSLQEADYEVASYGLQ 

HNCLGILGEDTDYLIYDTCPYFSISELCLESLD 

TVMLCREKLCESLGLCVADLPLLACLLGNDII 

PEGMFESFRYKCLSSYTSVKENFDKKGNIILA 

VSDHISKVLYLYQGEKKLEEILPIWTKQSSFL 

*RNGIISFTOT/nSlLHGFSKNPKV**LWTNK*YP 

RVQTPNPGKKFPCVQMLNPGKKFPCV QALNP 

GEKFPCIHI/PEPRQEVPTCSDPEPRQEVPTCTG 

PESRREVPMCSDPEPRQEVPMCTGPEPRQEVP 

MCTGPEARQEVPMCTDSEPRQEVPMCTDSEP 

RQEVPMYTGSEPRQEVPMYTGPESRQEVPMY 

TGPESRQEVLIRTDPESRQEIMCTGHESKQEV 
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Amino acid sequence (A=Alanine OCysteine, 
D=Aspartic Acid, E=Glutamic Acid, 
F=Phenylalanine, G=Glycine, H=Histidine, 
I=Isoleucine, K=Lysine, L=Leucine, 
M=Methionine, N=Asparagine, P=Proline, 
Q=Glutamine, R=Arginine, S=Serine, 
T=Threonine, V=Valine, W=Tryptophan, 
Y=Tyrosine, X=Unknown, *=Stop codon, 
/=possible nucleotide deletion, \=possible 
nucleotide insertion 














PICTDPISKQEDSMCTHAEINQKLPVATDFEFK 

LEALMCTNPEIKQEDPTNVGPEVKQQVTMVS 

DTEDLKVARTHH VQAES YL VYNIMS SGEIECS 

NTLEDELDQALPSQAFIYRPIRQRVYSLLLED 

CQDVTSTCLAVKEWFVYPGNPLRHPDLVRPL 

QMTIPGGTPSLKILWLNQEPEIQVRRLDTLLA 

CFNLSSSREELQAVESPFQALCCLLIYLFVQV 

DTLCLEDLHAFIAQALCLQGKSTSQLVNLQP 

DYINPRAVQLGSLLVRGLTTLVLVNSACGFP 

WKTSDFMPWNVFDGKLFHQKYLQSEKGYA 

VEVL/CRTK*ISAHQIPQPEGSRLQGLHEGEQT 

HHWPSPLGLTPRREVGKTGLQLPQDGLWV 


1094 


2444 


A 


9021 


97 


834 


AREACRAKTDFPGRRFRLWPSCCCRVIVGAE 

T*H\KlAEPVSPLKJiFVLAKXAITAIFDQLLEFV 

TEG SHFVEATYKNPELDRIATEDDL VEMQG Y 

KDKLSIIGEVLSRRHMKVAFFGRTSSGKSSVI 

NAMLWDKVLPSGIGHITNCFLSVEGTDGDKA 

YLMTEGSDEKKSVKTVNQLAHALHMDKDLK 

AGCLVRVFWPKAKCALLRDDLVLVDGPGTD 

VTTELDSWIDKFCTKSSTREITNSGSDT 


1095 


2445 


A 


9022 


1 


537 


LVLNSRVEDFVPPEGAGRTJLPFALRPLAACW 

LLHRRARRSSALCPRPRSWGVSGGEGAGARE 

P* ITSSSCCLS AA/SHLSIQSPNMAG ARRRIRPQ 

LAKEKIEGCHICTSVTPGEPQVFLGKDKAFTF 

DYVFDIDSQQEQIYIQCIEKLIEGCFEGYNATV 

FAYGQTVGAGKTYTMGTGFD 


1096 


2446 


A 


9029 


1 


285 


FFFFNVCKSPKVPKPGCKEESTGTLFKNTLISL 
GQHSETPSLKKK\LAGYSGMCL*SQVLRRLRQ 
EDCLSPGGGNCRES * SCPYTPAWITERDPV 


1097 


2447 


A 


9032 


716 


357 


ARSTGFWGEILWCGFLKRSLALSPRVKCSGAI 
LAHCNFRHAGFPPLSCLSLPNRWEYRRPPARP 
GKFFLVFLVETGFQC/G*DGLDLLTSRSACLG 
LPKCWDYRREPAASIIFQTTFFINSK 




2448 


A 

A 


9038 


230 


652 


KWVMSCEDINISGSFYRNKLKYLAFLCKRTS 
TNPSQGPYHLWVPSHIFWQTTCGRLPHKTKQ 
G*AALDHLKVFDRIPLPYDKKKQMAVSATEE 
VVRPKP+RKFAYLGHWAQKVDWKYQAMTA 
TMGEKRKVYYQKICYQKK 


1099 


2449 


A 


9043 


185 


372 


IIFYSHQQCMR V^WQGCGDIETLIHCW* E*KII 
HSL/WK/TV*QFLKRLYLHLPHNSVIAFLGISP 
RKIKTCPQNSCTSMLINAlIiNDQKWKKINI 


1100 


2450 


A 


9045 


763 


584 


RQSLALSPRLECSGTISAHCRLCPLVFTPLSCL 
SLTSSWDYRRPPPHPANFLYFK*RRGF 


1101 


2451 


A 


9050 


275 


2 


LFFLRKVSNQFLSPSLLPVNFQGFVFAFLLLLL 
FLL/FEMESLPVA/RVECSGTISAHCNLCLPGSS 
DSPASAS+VAGITDMCRYTQLILFHAS 


1102 


2452 


A 


9053 


449 


1224 


KTSMFWKFDLHSSSHIDTLLEREDVTLKELM 

DEEDVLQECKAQNRKLIEFLXKAECLEDLVSF 

I\*EEPPQDMDEKIRYKYPNISCELLTSDVSQM 

JNiJKJLOJbJLJboLLMKljYSrLLNDSPLNPLLASFF 

SKVLSILI SRKPEQIVDFLKKKHDFVDLIIKHIG 

TSAIMDLLLRLLTCIEPPQPRQDVLNAVFKVQ 

RNL*HST*NVMDISKYVNLHWGLNKSHSLL* 

LLLQCVLQWLNEEKI1QRJLVEIVHPSQEEDVS 
SLV 


1103 


2453 


A 


9058 


403 


3 


GLHVYDFQVYREHILTLNVKKCSVSFWGLRE 
WLYLQMYEIIKSPRFPIIKMTDITKCW+GC\GA 
AGMQI/H/CW\WCVNVGKFWEMS*YYLLKLSI 
ST/PYDPAIPLLGIYL*ETRVY1HPKTCMRMLIA 
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SEQ ID 
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nucl- 
eotide 
seq- 
uence 


SEQ ID 

NO: of 
peptide 

seq- 
uence 
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U „ J 
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SEQ 

ID NO: 

in 

USSN 
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correspondi 
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1 Ckti r\n 
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corresponding 
to last amino 
acid residue 
of peptide 

sequence 


Amino acid sequence (A=Alanine, C=Cysteine, 
D=Aspartic Acid, E=Glutamic Acid, 
F=Phenylalanine, G=Glycine, H=Histidine, 
I=Isoleucine, K=Lysine, L=Leucine, 
M=Methionine, N=Asparagine, P=Proline, 
Q=Glutamine, R=Arginine, S=Serine, 
T=Threonine, V=Valine, W=Tryptophan, 
Y=Tyrosine, X=Unknown, *=Stop codon, 
/=possible nucleotide deletion, \=possible 
nucleotide insertion 














APFVLAVNC 


1104 




A 




75 


393 


KWLFSSI NITGRGDIIGHLKWLDCR\NCSSFPI 
KRNRQTHSTESNKLKAGHSFGYN*LIH*NSVV 
KTDCGCGANSKGVWVMKVXKTAOOKOTTS 
YMQIGTTKNSRAT 


1105 


2455 


A 


9065 


366 


778 


DLLILR3VLAFPELKRRNCISRFVXAYHLHKIYS 
RSILLCNNCSGFYILSL*QYDVFFFNYFFFRDR 
AWPCCPGWSAAWLTIVILAHYRRPGLERSCC 
LSLSSSWDHRRVPPCPANF*/YFSMGFTAFPRL 
VI NS*TOGI 

V J^i^lO a V/VJJ 


1106 


2456 


A 


9083 


673 


816 


ESGSLIH*WWENKPAQPLWWEI*QHVQKLPT 
HFPCDP AlPLLGI CPED 


1107 


2457 


A 


9086 


580 


18 


KPSSG SFIRAIYIFLSTAHVPALFS VL VRTKLT* 

AFSQSSVLWAHKQQKTSLSLVIR/ERLQIKTA 

VRENFLPIRLAKILKLDNVKCWQG/SGSNMSL 

T/HPWWPY>JVTHn\]^ < ;VTFPPT^VFHVYrTYA 
i/nL> yv we I jn v ii hi w iNo v irri\A v Hiri v 1 1 1 i /a 

peisvr*ihgglptlvhqe™tsvfrgapsvip 


1108 


2458 


A 


9093 


540 


1 


ggndcsvitttepgrkeit*krkf*ektdrlp 
ga/ppsrtpptpypcphgdrllppsrplpagpa 
safppaersrghrrasl*rarwsaavprrsa 

A<vFPVn«:T? WT RT PVH^TVIPPAVPVPVPPAP 
nSRPAAPfJSRI PDPGI DSPAPSRTPSSSVD+GG 

qrppppsgdslsppgccry 


1109 


2459 


A 


9099 


1255 


1425 


hesyhvnpnlcnpvaptsgahsig*kwps\vl 
gavahscnpstlvgrggritrgqelr 


1110 


2460 


A 


9103 


242 


70 


eeqffffavgmfp*vdflapasgelwdrlrlt 
csrpftrhqsfglaflrvcssldslddswgp 
sallssvl/nqggrnvleareaakhpti*rqs 
llrkqrnkrmaip 


1111 


2461 


A 


9110 


189 


121 


sflsvrlecngaimahcalplpg 


1112 


2462 


A 


9115 


1 AA 

100 


Ol A 

yio 


aaagdpasldfaqclgyygyskfgnnnnym 
nmaeannaffaaseqtfhtpslgdeefeippit 
pppesdpalgmpdvllpfqalsdplpsqgseft 
pqfppqsldlpsitisrnlveqdgvlhssglhm 

nn^HTnv^nYPnnpsi TMR^^^^PDAARsn 

vmppaqlttinqsqlsaqlglnlggasmpht 
spsppasksatpspsssineedadeanraigek 

R A A PDSGK KPfCTPKK 


1113 


2463 


A 


9120 


3452 


3051 


FLRPSFAX VPQ AG VQWCALS WLQPPSPRPK* F 
SCLSLPSSWDYRHVPPRPAKFFVLLVETGFLH 
VGQAGHEPLTSGDPPASASQSAGITGVSHQA 
WPSFFIFSRDTVLLCCSGWSRTSGLKQSACLS 
T I KrWDY 


1114 


2464 


A 


9122 


152 


377 


NQLPLQQWTFFIYETGFCSVAQAGVQCRDHS 

SLHP*PPG\SSDPPAPPS*VLGITGQRYHACLI1 

YLYVQTVPQRV 


1115 


2465 


A 


9124 


553 


981 


QRPLLRQQLGSWPTCRSLEGDLASPW**RLPG 
SPRMRRSGT/ATLNLPLSPQGTVRTAVEFQVM 
TOTOSI ,SFLLGSSASLDCGFSMAPGLDLIS VE 
WRLQHKGRGRGDLHLPDHHLS VPS S ADHPA 
QQPSQFNGRNLYFLPLFR 


1116 


2466 


A 


9135 


48 


410 


SASHEPAEHDGGADSLSASQPPRPAGRPAGA 
QITVHVPPWTDVLAGQDRRAPTAGDGAPWP 
APGGHVPSTRPHDPAEFHADEAAGRGGRGLQ 
PAAPHALPAGLPHGPPAPA/PAEGGGTP*GSA 
GAGGP* GSPAGRACGAAGCRPRPPRPAASSA 
*NSAGS* GLVEGT*PPGAGHGAPSPAVGARLS 
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Amino acid sequence (A^Alanine C = Cysteine, 
D=Aspartic Acid, E=Glutamic Acid, 
F=Phenylalanine, G=Glycine, H=Histidine, 
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Y=Tyrosine, X=Unknown, *=Stop codon, 

/=possible nucleotide deletion, \=possible 
nucleotide insertion 














CPARTSVQGGTWTC*APAGRPAGLGGWEAE 
RES APPSCSAGS*DAD* GAEP WGAGSRSWGS 


1117 


2467 


A 


9141 


380 


939 


KSGHWAKECLQPRIPPRPCPICVGPHWKSDCP 

TPPfiAVPR APHTI Pn^tQI TT^^FPnT T QI VAcn 

*CCLMASEASWTmELWVTLTVEGKSVP/CL 
NTEATHSTLPSFQGPVSLASITWGIDGQASKP 
LKTPQLWCQLGQYSFMHYFLVIPTCPVPLLG* 
GILTKLSAFLTIPRLQPHLIAALSPSS 


1118 


2468 


A 


9154 


471 


2 


AAGQWVEVTSHLYLCITSDAAGLRLLPPAES 

ERGEGGHCPAEAPLPPRPQYCLAKHPLLRKLP 

EEKIKLDPYLTQHTKINSKQIKYLS/VRAKTTQ 

LVEGNIGVNLQNTELKQH*INGFLDTTPEAQE 

TKEKTNKLNFIKKVKRQLAEWEKIFQIA 


1119 


2469 


A 


9155 


2 


3187 


ACPRLARRRRRVRSLRJRRRGWLRARWSRGQ 

NNMAARRJTQETFDAVLQEKAKRYHMDASG 

EAVSETLQFKAQDLLRAVPRSRAEMYDDVHS 

DGRYSLSGSVAHSRDAGRESLRSDVFSGPSFR 

SSNPSISDDSYFRKECGRDLEFSHSNSRDQVIG 

HRKLGHFRSQDWKFALRGSWEQDFGHPVSQ 

ESS WSQEY SFGPSAVLGDFGSSRLIEKECLEK 

ESRDYDVDHPGEADSV/LRGGSQVQARGRAL 

NIVDQEGSLLGKGETQGLLTAKGGVGKLVTL 

RNVSTKKIPTVNRITPKTQGTNQIQKNTPSPD 

VTLGTNPGTEDIQFPIQKIPLGLDLICNLRLPRPv 

KMSFDIIDKSDVFSRFGIEIIKWAGFHTIKDDIK 

FSQLFQTLFELETETCAKMLASFKCSLKPEHR 

DFCFFTIKFLKHSALKTPRVDNEFLNMLLDKG 

AVKTKNCFFEHKPFDKYIMRLQDRLLKSVTP 

LLMACNAYELSVKMKTLSNPLDLALALETTN 

SLCRKSLALLGQTFSLASSFRQEKIL*AVGLQ 

DIAPSPAAFPNFEDSTLFGREYIDHLKAWLVS 

SGCPLQVKKAEPEPMREEEKMIPPTKPEIQAK 

APSSLSDAVPQRADHRWGTIDQLVKRVTEGS 

LSPKERTLLKEDPAYWFLSDENSLEYKYYKL 

KLAJEMQRMSENLRGADQKPTSADCAVRAML 

YSRAVRNLKKKLLFAWQRRGLLRAQG\LRG\ 

WKARRA\TTGTQTLLFLRAPGLKHHGRQAPG 

LoyAl^iLrUKNUAAi^uPPDPVGPSPQDPSL 

£Ao urorkr/iu V Ul -iCAJr \£ I o orLioAJUl JJMK. I 

METAEKLARFVAQVGPEIEQFSIENSTDNPDL 
WFLHDQNSSAFKF YRKKVFELCPSICFTS SPH 

>JT HTOGGnTTfi^ffF^PVT-T MFflFAFFFrvFPP 

PREAELESPEVMPEEEDEDDEDGGEEAPAPG 
GAGKSEGSTPADGLPGEAAEDDLAGAPALSQ 
ASSGTCFPRKRISSKST KVfJMTPAPlTR VCT THF 

PKGECPPVGTVASSTVLGWWAVRVRRDRWR 
HFNPKEFCAPLQNVSRHSCFPW 


1120 


2470 


A 


9163 


124 


207 


PPRACRPCPRACPCPPT*KCSQPVSWPC 


1121 


2471 


A 


9166 


272 


523 


PM SSLQGCFYTFKCIIFKGIFLLLISNLIAF* * EK 

V/CSHITDSLKFIGKGWVGMVTHACNPGTLG 

G*GGWIA*VREFETSLGNM 


1122 


2472 


C 


9170 


442 


236 


MNRRRFLRPADCHSGMRGTENGACSEGESQI 
HCGAGGEGVQLVHVVNQPENGCLQFDSTHIT 
FSKRQN* 


1123 


2473 


A 


9171 


10 


423 


MVDRSPLLTSVIIFYLAIGAAIFEVLEEPHWKE 
AKKNYYTQKLHLLKEFPCLGQEGLDK1LEVV 
SDAAGQGVAITGNQTFNNWNWPNAMIFAAT 
VITTIGYGNVASKTPGGRLFCGFYGLFGVPFC 
LTWINALGKFFG 
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Y=Tyrosine, X=Unknown, *=Stop codon, 
/=possible nucleotide deletion, \=possible 
nucleotide insertion 


1124 


2474 


A 


9173 


3 


374 


GPSPSLLVLLPQEPGGTGTPVRAGAGAGMWL 
WFDOGG1 LGPFSFLMI Ml 1 1 FTRNPVNATl 1 
TGS1 J 7 VLL G VFSFEP V P S CRALO FLKPRDR I S A 
I AHRGGRHDPPENTLG AIR/QGS* * WSNRR 


1125 


2475 


A 


9179 


704 


188 


ESSSGLLFQCFQGIHVQKLTLQARPTLFSWWL 
CSKPPKETGELENAESGGDGGRRGGKQDNV 

LPMGFFYLYFRDPGREITWKHFVQYYLARGL 

VDRLEWNKQSVRVIPAPGTSSEVRGEFKAE 

YCRHKFISCKNVVFYFFQ 


1126 


2476 


A 


9183 


153 


233 


MEYMAESTDRSPGHILCCECGVPISPN 


1127 


2477 


A 


9185 


1 


321 


LTGQLGSILLRVFSKSRAGLGARKLKAYRTM 
EYMAESTDRSPGHILCCECGVPl SPNPAQY\CV 
ACLRSSFHIYHCIPKLFIHPFSKTSSSAFITPSHY 

1 TT7T7CTIC 


1128 


2478 


A 


9186 


183 


847 


VLKFLLLQTMDEQSQGMQGPPVPQFQPQKAL 
RPDMGYNTLANFR1EKKIGRGQ\FSEVYRAAC 
L^JLDGVPVALKXVQIFDLMDAKARADCIKEID 

LLJvv^LNrlrN V IK I X AorlJ^lJlNJtL.INl VJLJcJjAUA 
GDLSRMIKHFKKQKRLIPERTVWKYFVQLCS 
ALEHMHSRRVMHRDIKPANVFITATGVVKXG 
DLGLGRFFSSKTTAAHSLVGTPYYMSPERJHD 

MP. 
INVj 


1129 


2479 


A 


9190 


1 


370 


GTSWKIPSAAVSESSPNGAAYASGLPCGVRG 
PPWAGLALLPSPTLMALLRRPTVSSDLDNIDT 
RATT\KIRVVATITRARIEDMRHSATALTRPD 
ATTAQIPKLPVTTVCNRRANPGIPPSVL 


1130 


2480 


A 

A 


9194 


1 "2 1 
I J 1 


a On 


AVI VDI DVDCCITrrADT T\/CI71X7I T> T I f" 1 \/ 

AYLKKLrVJ^JbMl LirAKL-1 vbb WLKJJLrrLLi V 

LALLGYLAVRPFLPKKKQQKDSLINLKIQKEN 

PKVVNEINIEDLCLTKAAYCRCWRSKTFPAC 

DGSHNKHNELTGDNVGPLILKKKE 


1131 


2481 


A 


9201 


184 


605 


KELVDEKSERGRAMDPVSQLASAGTFRVLKE 

PLAFLRALELLFAIFAFATCGGYSGGLRLSVD 

CVNKTESNLSIDIAFAYPFRLHQVTFEGVPTCE 

GKERHKLALIGDSSSSAEFFGTVAGFAFLYSL 

AATGVYIFFQNKY 


1132 


2482 


A 


9206 


1 


852 


GCX5RAGAGSRDMGSTDSKLNFRKAVIQLTTK 

Tl^DX /I? A TTPlT'V A T?M7'Pkf' > H?AX7 A WT A TO\7An\/'C A T \/ 

1 ^rVtAlULIArWiJyr WAJJi A 1 is V^UVr AJ_,V 

PAAEIRAVREESPSNLATLCYKAVEKLVQGA 
ESGCHSEKEKQIVLNCSRLLTRVLPYIFEDPD 
WRGFFWSTVPGAGRGGQGEEDDEHARPLAE 

SLDSCEYIWEAGVGFAHSPQPNYIHDMNRME 
LLKLLLTCFSEAMYLPPAPESWQH/RTHWFSS 
FVSSENRHALPLFTSLLNTVCAYDPVEYGIPY 
NHLY 


1133 


2483 


A 


9208 


1165 


1463 


GPRARVQGFSGADIVKFMALGSMYLVLTLIV 
AKVLRGAEPCCGPLKNRVLRPCPLP/VPLPPP 
HPQPSRGNPVGCLPTYKWYKLLSWPLHSNS 
NVYFIV 




2484 


A 

A 




DO 


1 JoO 


\A A.fl AriPVP"P A I Q A PV AT?Flf PFAR'FK'TK/f A ATf 

RADGAAPAGEGEGVTLQGNITLLKGVAVIW 

AIMGSGIFVTPTGVLKEAGSPGLALWWAAC 

GVFSIVGALCYAELGTTISKSGGDYAYMLDV 

YGSLPAFLKLWIELLHRPSSQYIVALVFATYL 

LKPLFPTCPVPEEAAKXVACLCVLLLTAVNC 

YSVKAATRVQDAFAAAKLLALALIILLGFVQI 

GKGDVSNLDPNFSFEGTKLDVGNIVLALYSG 

LFAYGGWNYLNFVTEEMINPYRNLPLAIIISLP 
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INTTLVYVLTNLAYFTTLSTEQMLSSEAVAVDF 

GNYHLGVMSWI1PVFVGLSCFGSVNGSLFTSS 

RLFFVGSREGHLPSILSMIHPQLLTPVPSLVFT 

CVMTLFYAFSKDIFSVINFFSFFNWLCVALAII 

GMIWLRHRKPELERPIKVNLALPVFFILACLF 

LIAVSFWKTTPWSVASDFTIILSGLPVYFFGV 

WWKNKPKWAPPGHLSPRPSCVRSSCMVVPQ 


1135 


2485 


A 


9216 


40 


410 


RDRLPPAYFCRPVVCVVTALDVG\SPESQEM 
DLVAFEDVAVNFTQEEWSLLDPSQKNLYREV 
MQETLRNLASIGEKWKDQNIEDQYKNPRNNL 
RSLLGERVDENTEENHCGETSSQIPDDTLNK 


1136 


2486 


A 


9223 


3 


983 


RRRRRSRYRRCSRFPRPGPLAVSMPHAFKPG 

DLVFAKMKGYPHWPAR1DDIADGAVKPPPN 

KYPEFFFGTHETAFLGPKDLFPYDKCKDKYGK 

PNKRKGFNEGL WEIQNNPHASYS APPPVS SSD 

SEAPEANPADGSDADEDDEG\RGVMAVTAVT 

ATAASDRMESDSDSDKSSDNSGLKRKTPALK 

MSVSKRARKASSDLDQASVSPSEEENSESSSE 

SEKTSDQDFTPEKKAAVRAPRRGPLGGRKKK 

APSASDSDSKADSDGAKPEPVAMARSASSSSS 

SSSSSDSDVSVKKPPRGRKPAEKPLPKPRGRK 

PKPERPPSSSSSD 


1137 


2487 


A 


9229 


21 


239 


LFPRLECRDPVTVNCTLNLPGSKNAPTTASQV 
GSTWNYRGGLPHPTNFFVKTGFRCSQAGLKL 
RGSREPPAWA 


1138 


2488 


A 


9231 


1664 


2 


TRSVGVNTCEVGWTEPECLGPCEPGTSVNL 

EGIVWHETEEGVLVVNVTWRNKTYVGTLLD 

CTKHDWAPPRFCESPTSDLEMRGGRGRGKR 

ARSAAAAPGSEASFTESRGLQNKNRGGANGK 

GRRGSLNASGRRTPPNCAAEDIKASPSSTNKR 

KNKPPMELDLNSSSEDNKPGKRVRTNSRSTP 

TTPQGKPE'lTFLDQGCSSPVLIDCPHPNCNKK 

YKHINGLRYHQAHAHLDPENKLEFEPDSEDK 

ISDCEEGLSNVALECSEPSTSVSAYDQLKAPA 

SPGAGNPPGTPKGKRELMSNGPGSIIGAKAGK 

NSGKKKGLNNELNNLPVISNMTAALDSCSAA 

DGSLAAEMPKLEAEGLIDKKNLGDKEKGKK 

ANNCKTDKN\PSKXKSARPIAPAPAPTPPQLIA 

JDPTATFTTTTTGTIPGLPSLIU'IVVQATPKSPPL 

KPIQPKJH'IMGEPITVNPALVSLKDKXKKEKR 

KLKDKEGKETGSPKMDAKLGKLEDSKGASK 

DLPGHFLKDHLNKNEGLANGLSESQESRMAS 

IKAEADKVYTFTDNAPSPSIGS 


1139 


2489 


A 


9234 


207 


443 


TRRGQPWRRRAAAAGJLPGREAAACLPSC/AS 

VTAAVSGLLVGYELGIISGALLQIKTLLALSC 

HEQEMGVSSLVIGALL 


1140 


2490 


A 


9238 


248 


328 


MAQGNNYGQTSNGVADESPNMLVYRKV 


1141 


2491 


A 


9242 


2 


535 


FVEAAVKMLGSLVLRRKALAPRLLLRLLRSP 

TLRGHGGASGRNVTTGSLGEPQWLRVATGG 

RPGTSPALFSGRGAATGGRQGGRFDTKCLAA 

ATWGRLPGPEETLPGQDSWNGVPSRAGLGMV 

WPWAAALWHCYSKSPSNKDAALLEAARAQ 

VNMQEVSRNRCALLHSAAVQEYGYGN 


1142 


2492 


A 


9245 


157 


466 


HLCFWFFVGLFLPEQQIMLFA'ELLRMAQGCD 
FALGNDFLMTTKAQA/TKEKLDKLDFIKIKTC 
CTSMDAIEKTEPLTKWTKAFVSHVSYKRLLF 
GICKEYSRQ 


1143 


2493 


A 


9247 


264 


115 


GLPQQTSTIQPPGTPDGARDFTSTIQPPGAPDG 
ARDSTSIIRMGPEIPPP 
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nucleotide insertion 


1144 


2494 


A 


9260 


1 


401 


KKVPGRLSEMSFSLNFTLPANTTSSPVTVDCGP 
SLGLAAGIPLLVATALLVALLFTLIHRRRSSIE 
AMEESDRPCEISEIDDNPKISENPRRSPTHEKN 
TMG AQE AHI YVKTV AG SEEP VHDRYRPTIEM 
ERRR 


1145 


2495 


A 


9264 


175 


411 


METIWIYQFRLIEIGDSTVGKSCLLHRFTQGRF 
PGLRSPACDPTVGVDFFSRLLEIEPGKRIKLLL 
WDTAGQERFI SIT 


1146 


2496 


A 


9277 


592 


814 


MFTYLEGREGIKSQPKMEPHSVT\RLECSGMI 

SAHCSLNLPGTSDSPASASR/VAGTTGMRHHA 

WLIFAFLVETGF 


1147 


2497 


A 


9279 


1255 


2 


FRRGRRGEEEKEEEEEEEEG WVN GMEN SHPP 
HHHHQQPPPQPGPSGERRNHHWRSYKLMIDP 
ALKKGHHKLYRYDGQHFSL AM SSNRPVEI VE 
DPRVVG1WTKNKE\LELSVPKFKIDEFYVDQV 
PPKQVTFAKLNDNIRENFLRDMCKKYGEVEE 

i 7T7TT VXTDVTVVUI riATVUPATV/PrJAK'nAVn 

HLHSTSVMGNITHVELDTKGETRA'IRFYEL\LV 

TGRYTPQTLPVGELDAVSPIVNETLQLSDALK 

RLKDGGLSAGCGSGSSSVTPNSGGTPFSQDTA 

YSSCRLDTPNSYG/QGTPLTPRLGTPFSQDSSY 

SSRQPTPSYLFSQDPAVTFKARRHESKFTDAY 

NRRHEHHYVHNSPAVTAVAGATAAFRGSSD 

LPFGTVGGTGGSSGPPFKAQPQDSATFAHTPP 

PAQATPAPGFR 


1148 


2498 


A 


9302 


1026 


6 


IASIQNADTMPGVGLLVSHFSTLVSRQRCPNY 
ADPQNLTDVSIFLLLEVSGDPELQPVLAGLFL 

Cfcrf/^T \/T"WI /"•"Ml 1 TT1 ATQPHCm MXPIV/IVTTFFCM 

bMCL V l.VL\jNL f LllL*Alk>i L/orUUril rlvl i rrrOiN 

LSLPDV\GFTSTTVPK\MIVDI\QSRSRVISYAG 

CLTQKSLFAIFGGTEE\NMLLSVMAYDRFVAI 

CHPLYHSAIMNPCFCAFLVLLSFFFLSLLDSQL 

HSWIVLQFTIIKNVEISNFVCDPSQLLKFACSD 

SimSIFIYFHKDPERQLVLAGLFLSMCLVTVL 

GNLIIILDVSPDSHLPTPMYFFLSNLSLPDIGFT 

STTVPKMIVDIQSHGRVIFYAGCLTQMSLFAIF 


1149 


2499 


A 


9303 


1 


699 


MASQEKDFIGWGTIHLFRKPQRSFFGKLLRE 

CUT VA ATYRQK/KTRVMT FflVTMI TPTCIFT T MWC 

SSTNSIAL'ASYTYLTIFDLFSLMTCLISYWVTL 

RKPSPVYSFGFERLEVLAVFASTVLAQLGALF 

it ICF9AFRF1 EOPEIHTGRLLVGTFVALCFNLF 

TMLSIRNKPFAYVSEAASTSWLQEHVADLSR 

SLCGIIPGLSSIFLPRMNPFVLIDLAGAFALCIT 

YMLIEI 


1150 


2500 


A 


9308 


797 


693 


DRSTSVTRAGVQWCSLGSLQPRTPGLLRSSCL 
<sT P 


1151 


2501 


A 


9309 


205 


406 


VAIKELP VLWKW SKPTR\TAKEPPQTQQRAG 
SKTAAPPCQW SRMA SEGPNIPCPG ARHSDKQ 
FLICTI 


1152 


2502 


A 


9314 


913 


504 


KPSPLITPPAVVLPPSAVLNLVKTFSSFPQVEV 
QGPLCGPRKGRLAVTIPFFGLS/LPKYMDHRR 
PPPHR\EIFFVFLAETGFHRASQAGPDLPTS/SA 
PPTSA/FPKCWEYRSEPQCLPGCLSFSGILLDL 
GTNVSLRAA 


1153 


2503 


A 


9315 


392 


1 


HPHRPRPGFRSPARSSRPCPVLTSLLPPFPSPSP 
PADDLVKAGRDRKDPQVR/ERRLRPNPGRLG 
GPR\PRPARARS/CHQPRLTRVCPRSPPPEARA 
P AP AAP ARGRG APKRNRPRTDTRAPRG S SAR 
PGNS 
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D=Aspartic Acid, E=Glutamic Acid, 
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1154 


2504 


A 


9321 


331 


433 


MPCI/QAQYGTPAPSPGPRDHSASDPLTPEFIK 
PT 


1155 


2505 


A 


9324 


180 


275 


MEEPQSDPSVEPPLSQETFSDLWKLLSENNVL 


1156 


2506 


A 


9326 


383 


619 


MISPSRTEGDPLPLPP/EGEGQEVRGFGGGPAK 

EAAQRHCRASVSILRMRRPGQGSSRPARVPL 

RGPDSHRLREPPPSPP 


1157 


2507 


A 


9327 


152 


292 


YERRGRSQGGG SHPAGAQPGGRAIGAG WQS 
KEPLWEGLQRSGSPLPG 


1158 


2508 


A 


9328 


1 


430 


QELKQGPNPLAPSPSAPSTSAGLGDCNHRVD 

LSKTFSVSSALAMLQERRCLYVVLTDSRCFL 

VCMCFLTFIQALMVSGYLSSVITTIERRYSLKS 

SESGLLVSCFDIGNLVWVFVSYFRGRRRRP/ 

RVAAVGGLLDLEGGEMI 


1159 


2509 


A 


9334 


108 


383 


KGNQVNGNGNQLKRKHESMCPVSLTQNTVR 
LMEAGLPQKQAERADELFEAGLVIYVKLDER 
VLNAUYSSVGLQWFKESDLSHLRLLEISFR 


1160 


2510 


A 


9338 


2 


430 


FVGRPRGLSDRLEDLFLAGFRVGERLRTAAM 
KRYVRILLLGEGAEHVADPVPGGRGVPRGEA 
DHTDQELREEIHKANV ERVVHDVSQEATIE1CI 
RTKWIPLV/RWGDHA/EGPVGIKSYLPSGRSM 
EAELPIMSQLTEIETCVEC 


1161 


2511 


A 


9341 


1 


390 


NSRVDDFVAPGLSEAGKLLGLEFPERQRLAA 

AVG/CSPMSGVISMSAPFFLGKIIDAIYTNPTV 

DYSDNLTRLCLGLSGVFLCGAAANAIRVYLM 

QTSRQRVVKRLRTSLFSSILGQEVAFSDKAGT 
GELJ 


1162 


2512 


A 


9343 


84 


837 


QGRFRAFCWQRDFLQPPGMRLSALLALASKV 

TLPPHYRYGMSPPGSVADKRKNPPWIRRRPV 

WEPISDEDWYLFCGDTVEILEGKDAGKQGK 

VVQVIRQRNWVVVGGLNTHYRYIGKTMDYR 

GTMIPSEAPLLHRQVKLVDPMDRKPTEIEWR 

FTEAGERVRVSTRSGRIIPKPEFPRADGIVPET 

WIDGPKDTSVEDALERTYVPCLKTLQEEVME 

AMGIKETR\NTRRSIGIEPGAEQLLPNFCPSLE 

G 


1163 


2513 


A 


9346 


967 


616 


DSLALSPRLECSGAISAHCNLTPPGFTPFSCLS 
LPSSWAYRCASPHPDNFFVFLVESGFHHVGQ 
AGLKLLISSDPPTSA/FPKCWDYRRD\SSAPAT 
FSSYQRNNPDLILNDTIMPNIK 


1164 


2514 


A 


9347 


3 


1099 


SSFPTCMRTV FHSNTS VSSLLHRPGHVTPQLT1 

HGGWRHHRDHTAIDEWDFNPSKFLIYTCLLL 

FSVLLPLRLDGIIQWSYWAVFAPIWLWKLLV 

VAGASVGAGVWARNPRYRTEGEACVEFKA 

MLIAVGIHLLLLMFEVLVCDRVERGTHFWLL 

VFMPLFFVSPVSVAACVWGFRHDRSLELEILC 

SVNILQFIFIALKLDRHHWPWLVVFVPLWILM 

SFLCLVVLYYIVWSLLFLRSLDVVAEQRRTH 

VTMAISWITIVVPLLTFEVLLVHRLDGHNTFS 

WSIFVPLWLSLLTLMATTFRRKGGNHWWF 

AIRRDF/CQDQLPQPTGKPPPPPLTDHHGEKA 

LPLQNKDRGSWPASRGSPRLL 


1165 


2515 


A 


9362 


547 


991 


DVSIGPPLLRRPCSGREQTRSLSFPSDPESSFSP 
VPEGVRLADGPGHCKGRVEVKHQNQWYTV 
CQTG WS LRAAK WCRQLRCGRA VLT\QKRC 
TKHAYGRKPIWLSQMACSGPEPTLHDCPFRP 
LGEDTLFHVEYTSVHGRERLSAKD 


1166 


2516 


A 


9363 


201 


387 


PP1LRWTPPSGKNFFFFFFFESEFY/S SPRVECS 
GAISAHLAHCNLCLPGSSDSPASAFQVAS 


1167 


2517 


A 


9368 


707 


1087 


AVLTPCLSPCSPSRIPRPVSRPYPGRRSLSHTPP 
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PRPLIL Y AP AP\RP AGT AF1PH SHPPPPULL RPT 

ATPA/TPCPSLPPPPRPLHPTQPSTALLPDPPPW 

PLPFPPPSS/RPPRPDCSTSYSPTFPPPT 


1168 


2518 


A 


9375 


511 


15 


MMLSEETSAVRPQKQTRFNGAKLVWMLKGS 

PITVTSAVIIVLMLUVIM/IFSPWLATHDPNAID 

LT ARLLPPS AAH WFGTDEVGRDLFSRVL VG S 

QQSILAGLVVVATTGMIGSPLECLFGELGGRA 

DA1FMRVMDIMRS/IPSLVLTMEKTAALGPSL 

FNAMQASSEH 


1169 


2519 


A 


9377 


42 


410 


GNGRVAPRDPGAVASAEPGLTTHDSGVNPN 
NSARRMEAMASGSNWLSGVNVVLVMAYWS 
LVFVLLFIFAKRQIMRFAMKSLRGPHGPVGH 
NAPKDLKEE IDILLSRVHNIK YEPVHLLADDDA 


1170 


2520 


A 


9378 


302 


1303 


GVSGFSASVLRQRRMEDELEPSLRPRTQIQGR 

ILLLT1CAAGIGGTFQFGYNLSIINAPTLHIQEF 

TNETWQARTGEPLPDHLVLLMWSLIVSLYPL 

GGLFGALLAGPLAITLGRKKSLLWNNIFVVS 

AAILFGFSRKAGSFEMIMLGRLASWGVNAGV 

SMNIQP\MLPGGESAPKELRGAVAMSSAIFTA 

LGIVMGQVVGLSTTAATGLRGLAAGELEELEE 

ERAACQGCRARRPWELFQHRALRRQVTSLV 

VLGSAMELCGNDSVYAYASSVFRKAGVPEA 

K1QYAIIGTGSCELLTAVVSVSLEGALPPPAL 

WGGTPRSFALNQFTLQKKKK 


1171 


2521 


A 


9381 


2 


412 


RGPASAQEDERARTAPLERVRARGRMTTSSA 

LFPSLLPCSWSTSNKYLAEFRAGICMSLKGTTE 

TPDKRKGLAYAQQTDDSLIHFCWKDRTSGNV 

EDDLIIFPDDCEFKRLPQCPNGRVYVLKFKAG 

SKRLFFWMQEP 


1172 


2522 


A 


9384 


20 


355 


G W NG RSTE A SP A AE APH VPHKETVKA AM GTQ 
CrHGGKVRPDPHDML'ITWHKIKLFVLCHSL 
LQLCAIMISDYLKSSIYTVEKRLGLFRPTSGLL 
ASFNEVGNTALIVLESY 


1173 


2523 


A 


9393 


430 


87 


lcqcivpgqqketfslnpssatvrfyl+lslq 
qrkedq*ul*yhlnkdclhifmsaitlymk:i* 

KIFVLFDFNIMFETPFYI1+FIFLFSQNLKRJRQV 
IRPPISFSKINNGP 


1174 


2524 


A 


9397 


77 


374 


ERLEIGRLGGERGSGPASCLRV1DVSGMWDQ 
RLVKLALLQLLRAFYGIKVKGVRVHRDCGTF 
ESSSTLIRVS*FGVPCNALAHFGVTHF*YILDF 
LGML 


1175 


2525 


A 


9399 


66 


397 


HESSRADRDKMDTRGSTYTDADPVNKSGGT 
AKMNKW SKG K V RDKLNNI. VLFDTAT YDKL 
CKEVPNYKLITLAVVSERLKIPGSLARAALHE 
LLSRGLI*LVIQHIAQVIY 


1176 


2526 


A 


9408 


2 


299 


LDLTHVLSLSISLTVTILGTTFGMVIPLLDVVY 
GERG Y AQNGDF * DAQLDDY SFSC Y SHAQVN 
G APNSLTRA YDDP* VKIS GLECQKVGAL VE V 
KCLNL 


1177 


2527 


A 


9416 


2 


402 


CNFLRSSRIRVHSTPAASTMPPKVDPNEIKVV 
YLKL ICjOEVKA I bALArKiurLUljooIiv VU VU 
FV+ATGDWNVL1ISVILTIRILLSH1FVVPPFFCF 
DHLIAFWDLQSLIFLHVIFSLFITLLLFCFFSIF 


1178 


2528 


A 


9419 


142 


426 


TPLFDLWPRWLSWLETVLTSLRTRRAASGPP 

ACR1MPTTVDDVLEHGGEVHFLQKQMLYLL 

ALI*DTFAPIYVGIVFLGFTPDHRCRSPGVAEL 


1179 


2529 


A 


9420 


1450 


1655 


LSSAGTKMNLN*KNYWPGASAHACNPSTLG 

GQSRCITRSGDRDHPG*HGETPSVLKIQKISRA 

WWRAP 
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D=As parti c Acid, E=Glutamic Acid, 
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TitlpIpotfHp inc#*rttrm 


1180 


2530 


A 


9422 


176 


375 


HRPQTTRPDWKPRT*PQGK*GRLSSEISPASPP 
SRFSRSTKPVPPKADPPARQKLTGVLHAPLLK 
L 


1181 


2531 


A 


9436 


2 


274 


PIAASLRMYNLQPYTEENLICTAFATMVETVP 

TAT? T7T FYRT TfiTPHriVr'PVi;* A mil/ATA ni?p\nj 

IYNGKPLPG ATPLLSLQLHQL AHLG S 


1182 


2532 


A 


9442 


3 


240 


VDKCSSKSIVLSEYCPHCMCSLSTDPKPFG QL 
LYPTEDYKLTFRARH 


1183 


2533 


A 


9444 


384 


3 


LKDFQPWALHDWPLFCCCTFLLFLVLECFTR 
KGCSGWAPWLSLQCQHFGRPRWADHLRSGV 
KUy rUl^ Y bit 1 1 r LrJvJ QKLAOH bG AH L* S* LL 
ERMRWKNRLNPGGRSCSEPRWHHCTPGWAT 


1184 


2534 


A 


9462 


391 


655 


LSGFKSLMPKIPLQYIYVRVRTTWSFCLPLDG 
RKLMLS*YSK*LT*KYNILPEYSRMTLPPGMV 
IHTCNPSTLGGRAGWIV*AQEFET 


1185 


2535 


A 


9467 


215 


566 


RCPMWQGQASRMDPAKAKDREASTCCSLA 

117117117/'" , YT.J"E/ r,, YI.n7l"> A 1 V I OCPn A /"""• r>I a />\tn r a ir 

W W WU WbC W VRAJ^KLoSGPAGPLACWVAK 
KKSLSLSGPVYPSEKGAGLYVF*DRVSLCHPG 
WSAWQFWLTAASNSCFSLLS S WDYRCA 


1186 




A 




Z / J 




rUrQLH 1 K 1 H Y VF1 KM VNKI*QIDNSKPWQR 
GG*TGILTHCW*ESKLVQPLWKIVWHYQ 


1187 


2537 


A 


9469 


388 


3 


EVAPGPSQILPRRVTDGGDRPQFSLPGPRLPQ 
b bKUAbr CLbN CIH o r Ar RKQRJ^IGDSDQ * STP 
NPASPHPEAPQEPWDSASGSVGSFSLGRGAK 
ASS*VPGKGRGPRQGSELLAETILELFLALAN 
S 


1188 


2538 


A 


9471 


I'M 


07 t 


1 MUKi^KlluNbLDMAbblHMTGPMCLIENTT 
GRLMANPEALKILSAITQPMVEEAIAGLYRAC 
* FYLTNNLAGMKKGLCLGSTEQAHTIGI 


1189 


2539 


A 


9480 


584 


769 


GHVQSQHFGRPRRADHLRSGDRDHPG*HDET 
PSLLKIQKISWAWWRAPVVPATWEAEAEEW 

K 


1190 


2540 


A 


9483 


463 


86 


VTVGLTLLLRGAPRFTAG*PPSGGGPPLAPLL 
PRQHCTLQTHRHLHPEAPVKV*KT* RLFPGLR 
OAooCKtOvKCWrVLAAKxvAG 
RRSRCPDTAHRRRRRGRRRNPSCVRSPRWR 


1191 


2541 


A 


9489 


1 


411 


T AF"iAT fl A A TO A VTJTV5 A T> A^DCTDDTll CD 

SVRVCCRAAAASNLLYSSCLQRHSERASEEG 
ERGSLSAKCCSLVLRGGCSSSNSHSFRR1T*EI 
MAAFVLLSYEQRPLKRPRLGPPDVYPPDPKQ 

KFFFT TAVWVTf 
r^dC/Ciij l /v v i\ v IV 


1192 


2542 


A 


9497 


389 


161 


VSFLSMSSGHCIRSTRGSKMVSWSVIAKIQEI* 

CEEDERKMAREFLAEFMSTYVMMNIHMIVE 

KDTYSDHEEINTS 


1193 


2543 


A 


9509 


186 


1 


IAKSQ* KRWQRSG AMETLKHG W WECKL VQF 
FGKTFVNVN* S*TYVYPCDKIILLLGLYPTEM 


1194 


2544 


A 


9512 


58 


433 


PLQRSKCLTLRCLRAKPWAWSQSPRACSSAL 
LKSSRSRASST NVOCTT OSNPOOT4nRI*ltriVA 

SSKGQQFRR*KEHPFMLKTLNKLRIEGT* LKI 
RRAIYDNPTANIIVEGQKLEAFPLRTGTRQ 


1195 


2545 


A 


9515 


595 


1223 


GHGAPSFQTQVPRTP*ASWPVVPAASESAPAP 

AGGGASLPVAAGSCAAAPHTEPGAPQHLLDC 

PCPLCLARPPRRPLPDTCYGPG SGRS ASLAEPP 

LPRCSCAPLRSASAPQVS*CV*AVNLLPHNL* 

PLHLLLHD+EKAWGFLFSSASHCFQGQICLLP 

APGSGPCGATARPSRGGRAGGSRARRPIPPGP 

GTRRTPSGCQNPAASGG 
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/^possible nucleotide deletion, \=possible 
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1 ft 


ZZy 




AERV APG WDLHTPYLPRTNSRRTPHL* * EPHA 

VJ I IVJiTXvT r IVlovJ vj Vv r UVJ V</ 


1197 


2547 


A 


9521 


289 


448 


IAWLSGLFFPSNQANLCFLCYKLTADSRYRG 
HAMRHT TYVNTSMATRFI *AD <: n?Ff>VnR ARVF 

APNWKYK YG Y* IPVDMLC 


1198 


2548 


A 


9524 


204 


1 


KNKKTTKCLSIVTLNISGPNQ*NKRHRVAEWI 
VKQEPNICHL*ETHFPFRDTYRLKEREQKKRK 
SSYS 


1199 


2549 


A 


9546 


1785 


1943 


GGRFKESKXTNAGWQRNSFFIGPPKSIPWAA 
V*QRGDGKNPGVTHLNRPVGTX 


1200 


2550 


A 

A 


9548 


186 


1 


VNAEKEF*KJQHYFMTKSQNKLHIEHTYLKPI 
KAJYDKWTSDIMLNLQKL*AFFLRV1VRQI 


1201 


2551 


A 


9549 


591 


2 


SSVVEFPRGPRSSLPPLDSTFPCGSSPNWTGGC 

GSCPSGE*LVSPGSEQRKKYSNSNVIMHETSQ 

YHVQHLATFIMDKSEAITSVDDAIRKLVQLSS 

KEKJWTQEMLLQVNDQSLRLLD1ESQEELEDF 

PLPTVQRSQTVLNQLRYPSVLLLVCQDSEQSK 

PDVHFFHCDEVEAELVHEYMESALTDCRLGK 

AMRP 


1202 


2552 


A 


9552 


428 


1 


KYGNEGHWSRQCPNPGKPIRPCPLCRGPHWK 

LDCERPPQGPLPSLPELAKTSYSDLTGLATED 

*WGPGMDAPATT1ASSKTRVTLMVAGRPVFF 

LI*YRATYSALPNFSGPTQSSQVSWGILXjQV 

SKPRATPPLFCSLHTF 


1203 


2553 


A 


9568 


517 


738 


RRKFERKQKQ* RYREGKQYRQRDKMKE WG 
EKEKRRREKGEREERKMRHRERKGESGQRD 
I MEN W R VE RLTEKER 


1204 


2554 


A 


9573 


83 


415 


EDKRLRLVDGDSRCAGRV*IYHDGFWGTICD 
DGWDLSDAHVVCQKLGCGVAFNATVSAHFG 
EGSGPIWLDDLNCTGTESHLWQCPSRGWGQ 
HDCRHKEDAGVICSEFTALR 


1205 


2555 


A 


9577 


64 


424 


ARGSCPTRPRTANGRMGETKDAPQML V TFK 
DVAVTFFREEWRQLVLVHRTLYR*GMLETC 
GLLDTLRHNVPQPDVVHLLYHGTQLLIVKRE 
VSHSPCAGDMRELFTREATLTPHPYNNGA 


1206 


2556 


A 


9584 


38 


476 


TLGAVLFSEVSKESSTSHSGGQLGRQNRHPKL 
SNFITPSSPRLKr* TAS SQRNLGQILNMFLT AV 
NPQPLSTPSWQIETKYSTKVLTGNWMEERRK 

/~>r DVVUT ITUUr\r DDUD V T I CTVnf\UV\ID UP 

YNPGLPPLRTWNGQKJLLWL 


1207 


2557 


A 


9586 


2 


412 


LRSSPAALLRALCITTVTGTALALRSRVATTN 
PDGCRNVLRPKYYRLCDKAESWGIALETVPT 
GVAVTSWA1MLTVLTLVCKGQDYNRRQKLP 
1 HlLCLL*fcK.vjIrUL 1 r ArllGLDGo I Or I KrFL 
FGILFSICFS 


1208 


2558 


A 


9597 


122 


3 


IKNYWPGMVAHACNPSPLGGRGRWIA*AQK 
FADAWADAW 


1209 


2559 


A 


9611 


148 


558 


KSIJINVWDLLNNTWKADRFFCHSSRTSTIRK 
GDPGPTFSKMSIWTSGRTSSSYRHDEKRNIYQ 

MVCSIMM*FLLGITLLRSYMQSVWTRESQCT 
LLNASITETFNC 


1210 


2560 


A 


9618 


384 


2 


SLHDMLMLAEQQQKQKWAVNTQNTAWSNA 
DSKFGQRlLEKiMEWSKGRGLGVQEQGGPDDI 
KVQVKNNDLGLQATINNEANWIAHQDDFNW 
LLAELNTCQRQETADS* * * WSPKNSH VGKDS 
GELSAK 


1211 


2561 


A 


9620 


316 


610 


QKHPGGGQLGRSPQEDSRFHNKASSGVSRVR 
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/^possiDie nucleotide deletion, Y^possible 
nucleotide insertion 














LGRAWWLTPVIPTLWEAKAGGSPE*D*AGRG 
GSRL* SQHFGRPRRVDHLRSAVQDQPGQHGE 
1 roLLKJl^KJN * V WGKKL'ob Y oEAEAGEoL 


1212 


2562 


A 


9623 


297 


344 


QFPVDGDYQKIEKITQLFQAQNLSLCLAMTR 
I KJtL'KOUGKuKHE* AVWrLKJCGGYGVKAP 
AILNTSNCT*CF*ETKMLSDDPKACVFEVSSA 
DL*NTSFGVIR 


1213 




A 

A 


9624 


Z 


35o 


AbLbLAS I ACGRNTSGDSLPDYDRAPISSPLA 
TSGTILSAISCLWDLPTPVLRVGLSCQPSMSSQ 
IPRMYSTDVEAAVNSLEDLYLQAYYAYLCVG 
LYFHRDDMALEGVSRFL*ELAE 


1214 


2564 


A 

A 




/ ft 




bLbR W VRAKL* VPYNQENCLNPRGGGC SEPR 
SHYCTPAWATEKDS 


1215 


2565 


A 


9636 


220 


426 


KPGNFAVSSEY*DITSGQLKTAVRG*IEMTST 
EENFGEKLHDIGFGNGFLDKT*KAQATKAK1 
DK 


1216 


2566 


A 


9637 


391 


76 


CFLEDGCTQAS*AEEAAVSPSMAEEEQGSTSC 
RERRSIRFKMKNHSPDDTIKENVTISNIRTRKI 
NHLPETERNLLEHGLMYIRLNAAFCSLVAHS 
LFGFILKAT 


1217 


2567 


A 


9655 


2008 


2432 


LHCKMGALETQTHPCSQNMLRSLQKCCCKV 

EEHHLQPVQVLQTLLHSATAGTGCRRPARPP 

PAPPTPTPWRSRQSGKQSERAS*LKGRGRYGL 

GALGGRGGRALGGSRWPPPLPGETLFSGCKH 

RRRRRGSDAAPGEEAGT 


1218 


2568 



A 


9658 


3 


405 


HASARALLSPNLSPNNKMAISGGPVLGFFIIA
VLMSAQEPWAIKEEHVIIQAEFYLNPDQSGEF 

M1X)FEGEDTFHGDMAKKETVWRLE*LARLD 
NFEAQRALANIAADQAALEIMDMGSDYTLIP 
NVPPKVTVL 


1219 


2569 


A 


9662 


3 


284 


PDWTEKRKMQDTGSILPLHWFGFGYAALVA 
YGGIIGYVKAGSVPSLAAGLLFGSLSGLGAYQ
LSQDPRNVWVFLATSGTLAGIMGMRFYHSG 
KL 


1220 


2570 


A 


9669 


200 


699 


LLLTGYIQTLQNQQLSGNQQEMQAVDNLTSA 

PGNTSLCTRDYKITQVLFPLLYTVLFFVGLITN 

GLAMR1FFQIRSKSNFIIFLKNTVISDLLMILTF 

PFKILSDAKLGTGPLRTFVCQVTSVIFYFTMYI 

SISFLGLITIDRYQKTTRPFKTSNPKNLLGAKIL 

K 


1221


2571


A 


9676 


ISA 

164 


562 


KERDSSTFSAAMTTMQGMEQAMPGAGPGVP 
QLGNMAVIHSHLWKGLQEKFLKGEPKVLGV 
VQILTALMSLSMGITMMCMASNTYGSNPISV 
YIGY 1 1 WGSVMFIISGSLSIAAGIRTTKGLVRG 
SLGMNITSS 


1222 


2572 


A 


9688 


43 


412 


VAKMVKCCSAIGCASRCLPNSKLKGLTFHVF 
PTDENIKRKWLAMKRLDVNAAGnVEPKKG 
DVLCSRHFKKTDFDRSAPNIKLKPGVIPSIFDS 
PYHLQGKREKLHCRKNFTLKTVPATNYNH 


1223 


2573 


A 


9696 


308 


564 


RTSMGILYSEPICQAAYQNDFGQVWRWVKE 
Dab Y AN VyUOr NOD 1 r L1CACRRGHVRI VSFL 
LKKECLCQPQKPERENLLALCCE 


1224 


2574 


A 


9700 


3 


632 


DAWASGGELGSLFDHHVQRAVCDTRAKYRE 

GRRPRAVKVYTINLESQYLLIQGVPAVGVMK 

ELVERFALYGAIEQYNALDEYPAEDFTEVYLI 

KFMNLQSARTAKRKMDEQSFFGGLLHVCYA 

PEFETVEETRKKLQMIIKAYVVKTTENKDHY 

VTKJOaVTEHKDTEDFRQDFHSEMSGFCKA 

ALNTSAGNSNPYLPYSCELPLCYFSSK


1225 


2575 


A 1 9710 

f 

1 


1 


163 


RSGCVLRMTEWETGAPAVAETPDIKLFGKWS 
TDDVHINDISLQDY1AGVRLILL 


1226 


2576 


A 


9713 


82 


492 


QGLPSFLPAFGPSGSWLGPAPTLGSSCNTVDT 
TPHGYSFfRPT FYT SFPDT 1 1 HI PWI TFTF f Yfi 

ASVANKDHCYNLQAVGQIFYISSFLYTVNYI 

WYLYTELRMKHTOSGOSTSPLVIDYTCRVCO 

MAFVFSSLI 


1227 


2577 


A 


9720 


3 


416 


GKWKRTQVPLLGEECADMDLARKEFLRGNG 

LAAGKMN1SIDLDTNYAELVLNVGRVTLGEN 

NRKKMKDCQLRKQQNENVSRAVCALLNSGG 

GVIKAEVENKGYSYKKDGIGLDLENSFSNML 

PFVPNFLDFMQNGNYF 


1228 


2578 


A 


9723 


278 


411 


EASSSNTVASNVADKTDPHSMNSRVFIGNLN 
TLVLQKSDVEAVF 


1229 


2579 


A 


9725 


111 


yuz 


L*r/\JYloOr dNLN lL)r I y£l o Y oll^LH^ol^t^o Y U X 

GGSGGPYSKQYAGYDYSQQGRFVPPDMMQP 
QQPYTGQIYQPTQAYTPASPQPFYGNNFEDEP 

NETDLAGPMVFCLAFGATLLLAGKIQFGYVY 

GISAIGCLGMFCLLNLMSMTGVSFGCVASVL 

GYCLLPM1LLSSFAVIFSLQGMVGIILTAGIIG 

WCSFSASKIFISALAMEGQQLLVAYPCALLYG 

VFALISVF 


1230 


2580 


A 


9739 


11 


247 


TFVLNMNTPKEEFQDWPIVRJAAHLPDLIVYG 
HFSPERPFMDYFDGVLMFVDISGKCKRDVCL 
MWMSNRLAWEFTCRA 


1231 


2581 


A 


9744 


37 


1100 


TPLFDFWPGFVLSWLQPLSASLRARRAASGPP 

ACRIMPTTVDDVLEHGGEFHFFQKQMFFLLA 

LLSATFAPIYVGIVFLGFTPDHRCRSPGVAELS 

YEVDWNQSTFDCVDPLASLDTNRSRLPLGPC 

RDGWVYETPGSSIVTEFNLVCANSWMLDLFQ 

SSVNVGFFIGSMSIGY1ADRFGRKLCLLTTVLI 

XI A A AfWT \A A TQPTVT WK/il TFT2T TOfiT V^T^Afl 
JN/V/Vr\VJ V LdVI/\Ior 1 I 1 W lVil_,_LT rvL/lV^vJJL. V Orv/WJ 

WLIGY1LITEFVGRRYRRTVGIFYQVAYTVGL 
LVLAGVAYALPHWRWLQFTVALPNFFFLLY 
YWCIPESPRWLISQNKNAFAMRIIKHIAKKNG 
KSLPASL 




1232


2582


A 


yy:>3 




j i / 


PfTPflMnfiPPPTTPT^W^I PPWRAYVAAAV1 P 

YINLLNYMNWFIIAGVLLDIQEVFQISDNHAG 
LLQTVFVSCLLLSAPVFGYLGDRHSRKATMS 
FGILLWSGAGLSSSFISPRYSWLF 


1233 


2583 


A 


9757 


25 


419 


LPAPWTERVRKSEGLVGTCLGDPMASPRTVT 

TV AT <3VAT fil FFVFMfTTTkl TPRT 91fr>AV<!FM 

KRAYKSYVRALPLLKKMGINSILI..RKSIGALE 
VACGIVMTLVPGRPKDVANFFLLLLVLAVLF 
FHQLV 


1234 


2584 


A 


9765 


71 


456 


RLELDWGFSLHFLPVAYLCPLSSGFEMNVQP 
CSRCGYGVYPAEKISCIDQIWHKACFHCEVC 

fNvTNvfT QV>JWFVQT-inK'1< r PVPH A P>JPK'>J X JTFTQ 

WHTPLNLNVRTFPEAISGIHDOEDGEOCKSV 
FHWD 


1235 


2585 


A 


9767 


52 


559 


IRSGAMSVDKAELCGSLLTWLQTFHVPSPCA 

SPQDLSSGLAVAYVLNQIDPSWFNEAWLQGI 

SEDPGPNWKLKVTSGLLIRGQTGEEMTRDGP 

ARHMSWVMGRKRDRCLVINHLFIHSSMEYSP 

CARPGHSARNNTDKNLPHTA1ILYTSNTYTTI 

KINFQAGRSGSCL 


1236 


2586 


A 


9770 


352 


608 


FRGEALTVRFLTKRFIGEYASNFESIYKKHLC



1237 



1238 






2587 






2588 



9793 



9802 








266 



537 



515 



967 






LERKQLNLEIYDPCSQTQKAKFSLTSELHWA 
DGFVIVYDISDRSSFAFAKALI 



NILAIIYFPFPRLFLLRDSQSNPKAFALTLCHH 
QKIKNFQELPVSIDALTPPLWCFLVSFLTHFS 
RYKPTRPVCITQFQGCS 



ELGAGRSDREAMEAAVKEEISVEDEAVDKNI
FRDCNKIAFYRRQKQWLSKKSTYRALLDSVT
TDEDSTRFQIINEASKVPLLAEIYGIEGNIFRLK
INEETPLKPRFEVPDVLTSKPSTVRLISCSGDT
GSLILADGKGDLKC 



1239 



2589 



9805 



105 



540 



VPGDPAMVRAGAVGAHLPASGLDIFGDLKK 

MNKRQLYYQVLNFAMIVSSALMIWKGLIVLT 

GSESPIWVLSGSMEPAFHRGDLLFLTNFRED 

PIRAGEIWFKVEGRDIPIVHRVDCVHEKDNG 

DIKFLTKGDNNEGDDRGSYK 



1240 



2590 



9819 



305 



1241 



1242 



1243 



1244 



1245 



2591 



2592 



2593 



2594 



2595 



1246 



1247 



2596 



2597 



TDGRDPLPCAARRRGGGGECCGAGWVAEWS 
PQPLDPAMLLWMQGFVLEAVACQDNDDYLR 
YGILFEDLDCNGDGWDIIELQEGLRNWSSAF 
DPNSEEHG 



9834 



841 



1209 



9843 



9846 



9848 



SPARGKSNRTDVMITAPKNKKMTENLAAPEA 
LDSSTHSSSTATQSRAKMNTPAPTPSTVPAIPR
GGSGGPPPCAPHDRVSSVLQCDTQAMDHKTE 
SSHSVVEFLFKRTKTPSPFHPAVRENRN 



589 



198 



411 



116 



650 



9849 573 



9850 



9851 



1620 



114 



464 



327 



TISCGPATEPPASLLSSASSDDFCKEKTEDRYS 

LGSSLDSGMRTPLCRICFQGPEQGELLSPCRC 

DGSVKCTHQPCLIKWISERGCWSCELCYYKY 

HVIAISTKNPLQWQAISLTVIEKVQVAAAILGS 

LFLIASISWLIWSTFSPSARWQRQDLLFQICYG 

MYGFMDVMIVAVDSEDMVQAAKEVGKRWS 

DIPP 



WRISHHAGKMPVMKGLLAPQNTFLDTIATRF 
DGTHSNFDLANAQVAKGFPIVYCSDGFCELAG 
FARTEVMQ 



PICGFLYLCSAMASESSPLLAYRLLGEEGVAL 

PANGAGGPGGASARKLSTFLGWVPTVLSMF 

SIVVFLRIGFVVGHAGLLQALAMLLVAYFILA 

LTVLSVCAIATNGAVQGGGAYCILQHRWTG 

VWPVLPAREVMISRTLGPEVGGSIGLMFYLA 

NVCGCAVSLLGLVESVLDVFGA 



KSKCRFPEGLSEGFGPMRKEALSSGSVQEAE 

AMLDEPQEQAEGSLTVYVISEHSSLLPQDMM 

SYIGPKRTAVVRGIMHREAFNIIGRRIVQVAQ 

AMSLTEDVLAAALADHLPEDKWSAEKKRPL 

KSSLGYEITFSLLNPDPKSHDVYWDIEGAVRR 

YVQPFLNALGAAGNFSVDSQILYYAMLGVNP 

RFDSASSSYYLDMHSLPHVINPVESRLGSSAA 

SLYPVLNFLLYVPELAHSPLYIQDKDGAPVAT 

NAFHSPRWGGIMVYNVDSKTYNASVLPVRV

EVDMVRVMEVFLAQLRLLFGIAQPQLPPKCL 

LSGPTSEGLMTWELDRLLWARSVENLATATT 

TLTSLA 



PPQLGAQRVREPRHPDVRAPLRVTSPGLRSRS 
ARSLGRRPRJAMVTVGNYCEAEGPVGPAWM 
QDGLSPCFFFTLVPSTRMALGTLALVLALPCK 
RRERPAGADSLSWGAGPRISSYV



FVRNKKMTRSCSAVGCSTRDTVLSRERGLSF
HQFPTDTIQRSKWIRAVNRVDPRSKKTWIPGP 
GAILCSKHFQESDFESYGIRRKLKJCGAVPSVS 
LYKVFKYSSRCTS


1248 


2598 


A 


9853 


58 


444 


RVDDFVYSKGGKDAGGADVSLACRRQSIPEE

FRGITVVELIKKEGSTLGLTISGGTDKDGKPR 

VSNLRPGGLAARSDLLNIGDYIRSVNGIHLTR 

LRHDEIITLLKNVGERWLEVEYELPPPGGCP 
WT 


1249 


2599 


A 


9856 



2 


1265 


LPPPRPSRHRRGRAGTRASAAAAAGPTVSAV 

RAPVRGQDSGAGTPQGRLAGRGAHLSRVGA 

SGSGVAAGPAARHAPRRRCADAGEAVGASC

GRCAVALLSGVCTLVSTHVCVGSGCPGAAGT

PMGAGDAGASAESAVTTAPQEPPARPLQAGS

GAGPAPGRAMRSTTLLALLALVLLYLVSGAL 

VFRALEQPHEQQAQRELGEVREKFLRAHPCV 

SDQELGLLIKEVADALGGGADPETNSTSNSSH 

SAWDLGSAFFFSGTIITTIGGGGDWHVGGGK 

ELPHGGRCRETEGSQVAPRLPASPLCPGYGN 

VALRTDAGRLFCIFYALVGIPLFGI1 XAGVGD 

RLGSSLRHGIGHIEAIFLKWHVPPELVRVLSA 

MLFLLIGCLLFVLTPTFVFCYMEDWSKLEAIY

FVIVTLTTVGFGDYVA 


1250 


2600 


A 


9873


2 


652 


FWPSPCGGIPGRAPNGASRPTMGNSASRNDF 

EWVYTDQPHTQRRKEILAKYPAIKALMRPDP 

RLKWAVLVLVLVQMLACWLVRGLAWRWLL 

FWAYAFGGCVNHSLTLAIHDISHNAAFGTGR 

AARNRWLAVFANLPEGVPYAASFKKYHVDH 

HRYLGGDGLDVDVPTRLEGWFFCTPARKLL 

WLVLQPFFYSLRPLCVHPKAVTRMEVLNTLV 

QLA 


1251 


2601 


A 


9875 


150 


1209 


PVIMPLHFSPGDIVRPSCCVSSSPKLRRNAHSR 

LESYRPDTDLSREDTGCNLQH1SDRENIDDLN 

MEFNPSDHPRASTIFLSKSQTDVREKRKSLFIN 

HHPPGQIARKYSSCSTIFLDDSTVSQPNLKYTI 

KCVALAIYYHIKNRDPDGRMLLDIFDENLHPL 

SKSEVPPDYDKHNPEQKQIYRFVRTLFSAAQL 

TAECAIVTLVYLERLLTYAEIDICPANWKRIV 

LGAJLLASKVWDDQAVWNVDYCQILKDITVE 

DMNELERQFLELLQFNINVPSSVYAKYYFDL 

RSLAEANNLSFPLEPL SRERAHKLE A1SRLCED 

KYKDLRRSARKRSASADNLTLPRWSPAIIS


1252 


2602 


A 


9879 


6 


376 


KRPDSRPPAQYRAGPTRPRTRGCELLYWKAT
KAVGIKMGSLSTANVEFCLDVFKELNSNNIG
DNIFFSSLSLLYALSMVLLGARGETEEQLEKV 
WNSSEVCSEPRSLSCSRSGSAKLILSLYQ 


1253 


2603 


A 


9880 


180 


388 


KEQAELLYGLYCQCDLTLSSHPSSVPAMSSC 

NFTHATFVLIGIPGLEKAHFWVGFPLLSMYVA 

ANCFGNC 


1254 


2604 


A 


9881 


19 


494 


VISFQIITDTIMDSSTAHSPVFLVFPPEITASEYE 
STELSATTFSTQSPLQKLFARKMKILGTIQILF 
GIMTFSFGVIFLFTLLKPYPRFPFIFLSGYPFWG 
S VLFINSGAFLIAVKRK 1' 1K1LIILSRIMNFLSA 
LGA1AGIILLTFEFHPRSKLHL 






A 

A 


9896 


72 


386 


RPGREQRDCFQAPPLGLGGRQTDMMHHPLT 
GATCVGLPNVGMCPQLSGALTFMYLQQGNQ 

EATVAPDTMAQPYASAQFAPPQNGIPGEYTA 
PHPHPAPEYTGQTT 


1256 


2606 


A 


9902 


95 


399 


SGGPAGLLHRPVLPXMGLSGLLPIL VpFILL G 
DIQEPGHAEGILGKPCPKIKVECEVEEIDQCTK 
PRDCPENMKCCPFSRGKKCLDFRKVSLTLYH 
KEELE 


1257 


2607 


A 


9905 


374 


459 


EHLKSTPNRLGWAHTCNPSTLGGRGGW


1258 


2608 


A 


9911 


364 


1974 


AGPGVPAVGGRWASGPGLGGRTLCSGPPDH 

QRRGPSCGASGDPQCVGSPHPQRARPLLARP 

GARLLPGHLPSPRPPRLPTGQPPAAAFRGPVR 

PQGGGHIHPLPTPGGRPCFAVSEGSGSALLLS 

YLGECGSSSYVTGAACISPVLRCREWFEAGLP 

WPYERGFLLHQKIALSRYATALEDTVDTSRL 

FRSRSLREFEEALFCHTKSFPISWDAYWDRND 

PLRDVDEAAVPVLCICSADDPVCGPPDHTLTT 

ELFHSNPYFFLLLSRHGGHCGFLRQEPLPAWS 

HEVILESFRALTEFFRTEERIKGLSRHRASFLG 

GRRRGGALQRREVSSSSNLEEIFNWKRSYTRL 

MAAAAGAAAAPGSREPQDRPECGAGHPGPR 

YYRHPERWLLRPEAFLGPLRTRAPSAEDSQR

ERPAARSGPEMRVRYPWAAVLAPYLALSQD 

PMYKSSASGQGASGSYNHVREEMLIKAGGA 

MSRRWRQSICFRHVFGQA.MCADQAYEDIRV 

SKVTWDSSFCAVNPKFLAIIVEAGGGGAFIVL

PLAK 


1259 


2609 


A 


9919 


693 


935 


GCFKFIGESTCCWIFPSSVTTQCWAKAPRAA 

TLSKAERLRSQPGPEQGGSSYRPRTPTAAAIL 

PPRPGRSHRKRKLVSTK 


1260 


2610 


A 


9921 


455 


1082 


QRSCLCSAIEKDGGDVKALYRRSQALEKLGR 

LDQAVLDLQRCVSLEPKNKVFQEALRNIGGQ 

IQEKVRYMSSTDAKVEQMFQILLDPEEKGTE 

KJCQKASQNLVVLAREDAGAEKIFRSNGVQLL 

QRLLDMGETDLMLAALRTLVGICSEHQSRTV 

ATLSILGTRRVVSILGVESQAVSLAACHLLQV 

MFDALKEGVKKGFRGKEGAIIV 


1261 


2611 


A 


9928 


1 


438 


GFRGAEAPGAAQAPKKKKPRPTEGGPGAGSG 
RGKDPYRGPTLLHQPKPPKDEFLSSLESYEIAF 
PTRVDHNGALLAFSPPPPQRQRRGTGATAES 
RLFYKEASPSTHFLLNLTRSSRLLAGHVSVEY
WTREGLAWQRADRPHCLYA 




1262


2612


A 

A 


9931 


168 


435 


AAEMGRAGAAAVIPGLALLWAVGLGGPPPA 
PPRLPFCLQELQGRHALHTFSLERTCSYQDFL 
WADEGRLLHVGAQDLATWHTLSPLGLW 


1263 


2613 


A 


9938 


247 


488 


RMSATSVDQRPKGQGNKVSVQNGSIHQKDG 
CNDDDFEPYLRSPDNQSNSYPPMSDPYMPGY 
YAPSIGFPYSLGEAAWSQL






A 

A 


9941 


*r i 

61 


277 


ESIGLTALGPRRRPWEHRWSDPITLKMKGWG 
WLALLLGALLGTAWARRSQDLHCGACKAVR 
RRVRQFNIYDY 


1265 


2615 


A 


9956 


2 


522 


FVASEVSKMPVPASWPHPPGPFLLLTLLLGLT 

EVAGEEELQMIQPEKLLLVTVGKTATLHCTV 

TSLLPVGPVLWFRGVGPGRELIYNQKEGHFP 

RVTTVSDLTKRNNMDFSIRISSITPADVGTYY

CVKFRKGSPDHVEFKSGAGTELSVRGEYSVG 

FLSQVWWWLSSHPFMN 


1266


2616 


A 


10002 


243 


387 


PKNNACHLLFTAVCQPRCKHGECIGPNKCKC 
HPGYAGKTCNQGRKTV 


1267 




A 






nan 
/U/ 


LPAiWo I Wb VAKb I MASbb VFPAT VSAATAG 

PGPGFGFASKTKKKHFVQQKVKVFRAADPLV 

GVFLWGVAHSINELSQVPPPVMLLPDDFKAS 

SKJKVNNHLFHRENTLPSFIFKFKEYCPQVFRNL 

RDRFGIDDQDYLVSLTRNPPSESEGSDGRFLIS

YDRTLVIKEVSSEDIADMHSNLSNYHQVRPLS 

SPILSLSSLLTYSSAIVSNRCQLGRKLIGRENP 


1268 


2618 


A 


10005 


2 


209


GEGYELFVPSNGVPAVCHMVGRRPHRAVLSP 
SQDELEHSLGESAAQGAAGWLWVSWENTR
TKVSLGLA


1269 


2619 


A 


10010 


245 


688 


FGMLKNKGHSSKKDNLAVNAVALQDHILHD 
LQLRNLSVADHSKTQVQKKENKSLKRDTKAI 
IDTGLKKTTQCPKLED SEKE YVLDPfCPPPLTL 
AQKLGLIUPFrrFLbbUbWbKVKQKbLLQGDS 
VQPCPICKEEFELRPQVFSIRG 


1270 


2620 


A 


10011


2 


CO o 

588 


RVDDFVRPLPPGLMSRSRASIHRGSIPAMSYA 
PFRDVRGPSTHRTQYVHSPYDRPGWNPRFCII 
SGNQLLMLDEDEIHPLLIRDRRSESSRNKLLR 
RTVSVPVEGRPHGEHEYHLGRSRRKSVPGGK 

t~W/Q\ jrC/~* ADA A DET> TiO/^/TT ODDI l/CPTVTl 'T'L'O 

Y bMbU Ar AArr Krot^ur i^KKJJvbblKR I K.S 
QPKLDRTSSFRQILPRFRSADHDRYRGWSMW 
DEIDV 


1271 


2621 


A 


10013 


209 


363 


LPAPPNLSPRLSFGFQFPGGNDNYLTITGPSHP 
FLSGAEVSQSCRRRGGRA 


1272 


2622 


A 


10014 


7 


388 


SAVTISWKWRSVMGIQTSPALLASLGAGLVT 
LLGLAVGSYLVRRSRRPQVTLLDPNEKDLLR 
LIDKTLSARSPCKHIYLSTRIDGSLSIRPYTPVT 
SDEDQGYVDIDIKVYLKGVHPTFPEGGKMSH 


1273 


2623 


A 


10016 


1 


1339 


MAARTLGRGVGRLLGSLRGLSGQPARPPCGV

SAPRRAASGPSGSAPAVAAAAAQPGSYPALS 

AQAAREPAAFWGPLARDTLVWDTPYHTVW

DCDFSTGKIGWFLGGQLNVSVNCLDQHVRKS 

PESVALIWERDEPGTEVRITYRELLETTCRLA 

NTLKRHGVHRGDRVAIYMPVSPLAVAAMLA 

CARIGAVHTVIFAGFSAESLAGRINDAKCKW

ITFNQGLRGGRVVELKKIVDEAVKHCPTVQH 

VLVAHRTDNKVHMGDLDVPLEQEMAKEDP

VCAPESMGSEDMLFMLYTSGSTGMPKGIVHT 

QAGYLLYAALTHKLVFDHQPGDIFGCVADIG

WITGHSYVVYGPLCNGATSVLFESTPVYPNA 

GRYWETVERLKINQFYGAPTAVRLLLKYGD 

AWVKKYDRSSLRTLGSVGEPINCEAWEWLH

RWGDSRCTLVDTWWQT 


1274 


2624 


A 


10017 


1 


3750 


FRPQGTPRSPASHVLTMSAPDEGRRDPPKPKG 

KTLGSFFGSLPGFSSARNLVANAHSSARARPA 

ADPTGAPAAEAAQPQAQVAAHPEQTAPWTE 

KELQPSEKMVSGAKDLVCSKMSRAKDAVSS

GVASVVDVAKGWQGGLDTTRSALTGTKEV 

VSSGVTGAMDMAKGAVQGGLDTSKAVLTG 

TKDTVSTGLTGAVNVAKGTVQAGVDTTKTV 

LTGTKDTVTTGVMGAVNLAKGTVQTGVETS 

KAVLTGTKDAVSTGLTGAVNVARGSIQTGV 

DTSKTVLTGTKDTVCSGVTGAMNVAKGTIQT 

GVDTSKTVLTGTKDTVCSGVTGAMNVAKGT 

IQTGVDTSKTVLTGTKDTVCSGVTGAMNVA 

KGTIQTGVDTTKTVLTGTKNTVCSGVTGAVN 

LAKEAIQGGLDTTKSMVMGTKDTMSTGLTG 

AANVAKGAMQTGLNTTQNIATGTfCDTVCSG 

VTGAJV^l^GTIQTGVDTTXJVLTGTKDTVC 

SGVTGAANVAKGAVQGGLDTTKSVLTGTKD 

AVSTGLTGAVNVAKGTVQTGVDTTKTVLTG 

TKDTVCSGVTSAVNVAKGAVQGGLDTTKSV

VIGTTCDTMSTGLTGAANVAKGAVQTGVDTA

KTVLTGTKDTVTTGLVGAVNVAKGTVQTGM

DTTKTVLTGTKDTIYSGVTSAVNVAKGAVQT

GLKTTQNIATGTKNTFGSGVTSAVNVAKGAA 

QTGVDTAKTVLTGTKDTVTTGLMGAVNVAK 

GTVQTSVDTTKTVLTGTKDTVCSGVTGAAN

VAKGAIQGGLDTTKSVLTGTKDAVSTGLTGA

VKLAKGTVQTGMDTTKTVLTGTKDAVCSGV 

TGAANVAKGAVQMGVDTAKTVLTGTKDTV 

CSGVTGAANVAKGAVQTGLKTTQNIATGTK 

NTLGSGVTGAAKVAKGAVQGGLDTTKSVLT 

GTKDAVSTGLTGAVNLAKGTVQTGVDTSKT 

VLTGTKDTVCSGVTGAVNVAKGTVQTGVDT 

AKTVLSGAKDAVTTGVTGAVNVAKGTVQTG 

VDASKAVLMGTKDTVFSGVTGAMSMAKGA 

VQGGLDTTKTVLTGTKDAVSAGLMGSGNVA 

TGATHTGLSTFQNWLPSTPATSWGGLTSSRT 

TDNGGEQTALSPQEAPFSGISTPPDVLSVGPEP 

AWEAAATTKGLATDVATFTQGAAPGREDTG 

LLATTHGPEEAPRLAMLQNELEGLGDIFHPM 

NAEEQAQLAASQPGPKVLSAEQGSYFVRLGD 

LGPSFRQRAFEHAVSHLQHGQFQARDTLAQL 

QDCFRL 


1275 


2625 


A 


10025 


124 


415 


TILARKKEKTCPCKKEIGRNSRSGMYSRKAM 
YKRKYSAANTKVEKKKKEKVLAPVTKPVGG 
DKNGGTRVVKLPTMPRYYPTEDVPRKLLSHG 
KKPFS 


1276 


2626 


A 


10030 


3 


507 


GGSLRFSPPRVPSCSRVFCPVPPGGCGLPSPMS 

ASRPQSPTTPWCLPRRYMKHKRDDGPEKQED 

EAVDVTPVMTCVFVVMCCSMLVLLYYFYDL 

LVYVVIGIFCLASATGLYSCLAPCVRRiLPFGK 

CRIPNNSLPYFHKJRPQARMLLLALFCVAVSV 

VWGVFRNEDQ 


1277 


2627 


A 


10035 


C A 

51 


869 


YSRFTVPLPATMASSEVARHLLFQSHMATKT 

TCMSSQGSDDEQEKRENIRSLTMSGHVGFESL 

PDQLVNRSIQQGFCFNILCVGETGIGKSTLIDT 

LFNTNFEDYESSHFCPNVKLKAQTYELQESN 

VQLKLTIVNTVGFGDQrNKEERQLGRSQSTEN 

PQKYRSEQIIPVEPKKCTSFWKGALGKWAGIE 

SSGQSAQQPYLPINSPPHRLADVADVHLFSSV 

LSGAFGCYHLDVTVNEFKKQQNRDEQEGYS 

KuDQEQG 0 WKHG ADPLRGG EM 


1278 


2628 


A 


10036 


3 


457 


RAFDVRRKKSLRPCCPRDFHAGCLTVSGPST
VMGAVGESLSVQCRYEEKYKTFNKYWCRQP 
CLPIWHEMVETGGSEGWRSDQVIITDHPGDL 
1 r I V lbfcML I ADUAOKYRCuIAIlLQEDGLSG 
FLPDPFFQVQVLVSSASSTENSVKTP 


1279


2629




A 
/A 


i hftiQ 
1 \j\Jjy 


71/f 




NUbLr VrMb5> WRbUAKAPSbbbA WRRS AATRR 
SRKCLRTKRKRWSSGKGTQMQSTLSETPRRA 
QMPCMWWYPFWG 


1280 


2630 


A 


10043 


2 


344 


RATWHNAGKEREAVQLMAGAEKRVKASHS
FLRGLFGGNTOJEEACEMYTRAANMFKhdAK 
NWSAAGNAFCQAAKLHMQLQSKHDSATSFV 
DAGNAYKKADPQGKTARHVACYLCV 


1281 


2631 


A 


10080 


620 


818 


VIYKLDSSLFSYFIYFFIFETESHFLPLMKWTG 
PIMAHCSLKILASRNSADSAFLSAGDTSLSHST 






A 




■a 
j 


1D4U 


Q a CUT© /~T\V L> A C/^T?\7/^T ADCCDUTf 17 DO AW 

NGTAIISLVRGPGILGEVTVFWRIFPPSVGEFA 
ETSGKLTMRDEQSAVIVVIQALNDDIPEEKSF
YEFQLTAVSEGGVLSESSSTANITVVASDSPY 
GRFAFSHEQLRVSEAQRVNITIIRSSGDFGHVR 
LWYKTMSGTAEAGLDFVPAAGELLFEAGEM 
RKSLHVEILDDDYPEGPEEFSLTITKVELQGR 
GYDFTIQENGLQIDQPPEIGNISIVR1IIMKNDN 
AEGIIEFDPKYTAFEVEEDVGLIMIPWRLHGT
YGYVTADFISQSSSASPGGVDYILHGSTVTFQ

HGQNLSFINISIIDDNESEFEEPIEILLTGATGG 

AVLGRHLVSRIIIAKSDSPFGVIRFLNQSKISIA 

NPNSTMILSLVLERTGGLLGEIQVNWETVGPN 

SQEALLPQNRDIADPVSGLFYFGEGEGGVRT1I 

LTIYPHEE1EVEETF1IKLHLVKGEAKLDSRAK 

DVTLTIQEFGDPNGVVQFAPETLSKKTYSEPL 

ALEGPLLITFFVRRVKGTFGEIM 


1283 


2633 


A 


10088 


316 


516 


MGSKTLPAPVPIHPSLQLTNYSFLQAVNGLPT 
VPSDHLPNLYGFSALHAVHLHQWTLGYPAM 
HLXRS 


1284 


2634 


A 


10091 


2 


569 


FVSPSRAMASALIYVSKFKSFVILWTPLLLLP 

LVILMPAKFVRCAYVIILMAIYWCTEVIPLAV

TSLMPVLLFPLFQILDSRQVCVQYMKDTNML 

FLGGLIVAVAVERWNLHKRIALRTLLWVGA 

KPARLMLGFMGVTALLSMWISNTATTAMMV 

PIVEAILQQMEATSAATEAGLELVDKGKAKE 

LP 


1285 


2635 


A 


10092 


290 


728 


KQSTRPDVMTLYPLHWQEEMSGESVVSSAVP 

AAATRTTSFKGTSPSSKYVKLNVGGALYYTT 

MQTLTKQDTMLKAMFSGRMEVLTDSEGWIL 

IDRCGKHFGTILNYLRDGAVPLPESRREIEELL 

AEAKYYLVQGLVEECQAALQV 


1286 


2636 


A 


10100 


1 


574 


RPRGRGAWAGPGGDYSGVRRQQRRRTRISGS 

QRGSDAAGTMGCCTGRCSLICLCALQLVSAL 

ERQIFDFLGFQWAPILGNFLHIIVVILGLFGTIQ 

YRPRYIMVYTVWTALWVTWNVFIICFYLEVG 

GLSKBTDLNfTFNISVHRSWWREHGPGCVRR 

VLPPSAHGMMDDYTYVSVTGCIVDFQYLEVI 

HSA 


1287 


2637 


A 


10103 


252 


376 


RSRMGDKPIWEQIGSSFIQHYYQLFDNDRTQL 
GAIYVSFQL 


1288 


2638 


A 


10107 


1 


478 


MEEEDESRGKTEESGEDRGDGPPDRDPTLSPS 

AFILRAIQQAVGSSLQGDLPNDKDGSRCHGL 

RWRRCRSPRSEPRSQESGGTDTATVLDMATD 

SFLAGLVSVLDPPDTWVPSRLDLRPGESEDM 

LELVAEVRIGDRDPIPLPVPSLLPRLRAWRTG 

KT 


1289 


2639 


A 


10113 


237 


438 


LLSRMPSTNTRAGSLKDPEIAELFFKEDPEKLFT 

DLREIGHGSFGAAYFARDVRTNEWAHCKMS 

YSG 


1290 


2640 


A 


10114 


367 


856 


RGAKAKSAVLPPGPPCSSILILSPPAPLTPRSPG 

TEATRPTAMSKSLKKKSHWTSKVHESVIGRN 

PEGQLGFELKGGAENGQFPYLGEVKPGKVAY 

ESGSKLVSEELLLEVNETPVAGLTIRDVLAVI 

KHCKDPLRLKCVKQGESSGLLSVLPGGGTAR 

GAGQ 


1291 


2641 


A 


10116 


128 


591 


RTIRETERRSALSCSVLKSEPLPGLQPQASQQR 

RRRLPGRRQVQVQEGGGSGLRAWVLAMASV 

LGSGRGSGGLSSQLKCKSKRRRRRRSKRKDK 

VSILSTFLAPFKHLSPGITNTEDDDTLSTSSAE 

VKENRNVGNLAARPPPSGDRARGGATR 


1292 


2642 


A 


10121 


1 


749 


QRRRFRAGLWGGHGLTDGLRRNGGCGCSAR 

VPRVGERLRGHRCPDPLCLLLDMLFLSFHAG 

SWESWCCCCLIPADRPWDRGQHWQLEMADT 

RSVHETRFEAAVKVIQSLPKNGSFQPTNEMM 

LKFYSFYKQATEGPCKLSRPGFWDPIGRYKW 

DAWSSLGDMTTCEEAMlAYVEEMKKnETMP 

MTEKVEELLRVIGPFYEIVEDKKSGRSSDITSD

LGNVLTSTPNAKTVNGKAESSDSGAESEEEE
AC 


1293 


2643 


A 


10124 


2 


989 


PLMSLVRVVEFVAASSAQKTPSRLENYYMVC 

KADEKFNQLVHFLRNHKQEKHLVFFRYSSGL 

CGRGIRDSARMCSTCACVEYYGKALEVLVK 

GVKIMCIHGKMKYKRNKIFMEFRKLQSGILV 

CTDVMARGIDIPEVNWVLQYDPPSNASAFVH 

RCGRTARIGHGGSALVFLLPMEESYINFLAIN 

QKCPLQEMKPQRNTADLLPKLKSMALADRA 

VFEKGMKAFVSYVQAYAKHECNLIFRLKDL 

DFASLARGFALLRMPKMPELRGKQFPDFVPV 

DVNTDTIPFKDKIREKQRQKLLEQQRREKTEN 

EGRRKF1KNKAWSKQKAKKK 


1294 


2644 


A 



10129 


91 


1042 


VTMYKDCIESTGDYFLLCDAEGPWGHLESLA 

JLGlWTILLLIAi^FUvlRKUQDCSQVVW/LPTQ 

LLFLLSVLGLFGLAFAFIIELNQQTAPVRYFLF 

GVLFALCFSCLLAHASNLVKLVRGCVSFSWT 

TILCIA1GCSLLQ1HATEYVTLIMTRGMMFVN 

MTPCQLNVDFVVLLVYVLFLMALTFFVSKAT 

FCGPCENWKQHGRLlFlTVLFSinWWWISML 

LRGNPQFQRQPQWDDPWCIALVTNAWVFL 

LLYIVPELCILYRSCRQECPLQGNACPVTAYQ 

HSFQVENQELSRDKWKVLLNSDFLSHSGA 


1295 


2645 


A 


10133 


376 


518 


RPRWTHNSQWCFLPQDHPGWLPGQSGAPG 
GRGAPRQEGPGSSWRQV 


1296 


2646 


A 


10135 


3 


551 


EWSLDPFMGMSGQVGDLSPSQEKSLAQFRE 

NIQDVLSALPNPDDYFLLRWLQARSFDLQKS 

EDMLRKHMEFRKQQDLANILAWQPPEWRL 

YNANGICGHDGEGSPVWYHIVGSQDPKGLLL 

SASKQELLRDSFRSCELLLRECELQSQKLGKR 

VEKIIAIFGLEGLGLRDLWKPGIELLQE


1297 


2647 


A 


10138 


48 


407 


MVSSCCGSVCSDQGCGQDLCQETCCRPSCCE 
TTCCRTTCCRPSCCVSSCCRPQCCQSVCCQPT 
CSRPSCCQTTCCRTTCYRPSCCVSSCCRPQCC 
QPVCCQPTCCRPSCCETTCCHPXCC 


1298 


2648 


A 


10156 


94 


453 


GGNRKSAEMFSQVPRTPASGCYYLNSMTPEG 
QEMYLRFDQTTRRSPYRMSRILARHQLVTK1 
QQEIEAKEACDWLRAAGFPQYAQLYEDSQFP 
INIVAVKNDHDFLEKDLGEPLCRRLNT 


1299 


2649 


A 


10161 


1 


393 


PRFSELVDGRGRVSARFGGSPSKAATVRSQPT 

ASAQLENMEEAPKRVSLALQLPEHGSKDIGN 

VPGNCSENPCQNGGTCVPGADAHSCDCGPGF 

KGRRCELACIKVSRPCTRLFSETKAFPVWEGG 

VCHHV 


1300 


2650 


A 


10162 


98 


391 


AKIASLERIMPANYTCTRPDGDNTDFRYFIYA 
VTYTGILGPGLIGNILALWVFYGYMKETKRA 
VIFMINLAIADLLQVLSLPLRIFYYLKHDWPF 
VPV 


1301 


2651 


A 


10165 


1 


7545 


PGIRVGITSQTGLSSNLQENCSKLAFISSHGTE 

KQLQCMPMEGRGRASSS1SDLQGKGFEKGTG 

EKHVPGVGSARHSPQASAGGSPWQRGKAQT 

RWLGKPDPGRKRRRGSPQEEGGLRVSAAAR 

LLCSGANRCKVLVRQNSTPNTQQPAVHPSTP 

PSRPLPQAGRCLVAPLRPHPDWVAAKTLAKA 

LRAPGKPWRLAAPSPLGDLGAPGLPGPSTAP 

RTLSVEEPGVECNQLCLYADVTDPVLCLGQK 

DPGVEGKHCEKEKISSSKELKHVHAKSEPSKP 

ARRLSESLHWDENKNESKIEREHKRRTSTPV 

IMEGVQEETDTRDVKRQVERSEICTEEPQKQ

KSTLKNEKHLKKDDSETPHLKSLLKKEVKSS

KEKPEREKTPSEDKLSVKHKYKGDCMHKTG 

DETELHSSEKGLKVEENIQKQSQQTKLSSDDK 

TERKSKHRNERKLSVLGKDGKPVSEYIIKTDE 

NVRKENNKKERRLSAEKTKAEHKSRRSSDSK 

IQKDSLGSKQHGITLQRRSESYSEDKCDMDST 

NMDSNLKPEEVVHKEKRRTKSLLEEKLVLKS 

KSKTQGKQVKVVETELQEGATKQATTPKPD 

KEKNTEENDSEKQRKSKVEDKPFEETGVEPV 

LETASSSAHSTQKDSSHRAKLPLAKEKYKSD 

KX>STSTRLERKLSDGHKSRSLKHSSKDIKKKD 

ENKSDDKDGKEVDSSHEKARGNSSLMEKKL 

SRRLCENRRGSLSQEMAKGEEKLAANTLSTP 

SGSSLQRPKKSGDMTLIPEQEPMEIDSEPGVE 

NVFEVSKTQDNRNNNSHQDIDSENMKQKTS 

ATVQKDELRTCTADSKATAPAYKPGRGTGV 

NSNSEKHADHRSTLTKKMHIQSAVSKMNPGE 

KEPIHRGTTEVNIDSETVHRMLLSAPSENDRV 

QKNLKNTAAEEHVAQGDATLEHSTNLDSSPS 

LSSVTVVPLRESYDPDVIPLFDKRTVLEGSTA 

STSPADHSALPNQSLTVRESEVLKTSDSKEGG 

EGFTVDTPAKASITSKRHIPEAHQATLLDGKQ 

GKVIMPLGSKLTGVIVENENITKEGGLVDMA 

KKENDLNAEPNLKQTIKATVENGKKDGIAVD 

HVVGLNTEKYAETVKLKHKRSPGKVKDISID 

VERRNENSEVDTSAGSGSAPSVLHQRNGQTE 

DVATGPRRAEKTSVATSTEGKDKDVTLSPVK 

AGPATTTSSETRQSEVALPCTSIEADEGLIIGT 

HSRNNPLHVGAEASECTVFAAAEEGGAWTE 

GFAESETFLTSTKEGESGECAVAESEDRAADL 

LAVHAVK1EANVNSVVTEEKDDAVTSAGSEE 

KCDGSLSRDSETVEGTIT7ISEVESDGAVTSAG 

TEIRAGSISSEEVDGSQGNMMRMGPKKETEG 

TVTCTGAEGRSDNFVICSVTGAGPREERMVT 

GAGWLGDNDAPPGTSASQEGDGSVNDGTE 

GESAVTSTGITEDGEGPASCTGSEDSSEGFAIS 

SESEENGESAMDSTVAKEGTNVPLVAAGPCD 

DEGIVTSTGAKEEDEEGEDVVTSTGRGNEIGH 

ASTCTGLGEESEGVLICESAEGDSQIGTWEH 

VEAEAGAAIMNANENNVDSMSGTEKGSKDT 

DICSSAKGIVESSVTSAVSGKBEVTPVPGGCE 

GPMTSAASDQSDSQLEKVEDTTISTGLVGGS 

YDVLVSGEVPECEVAHTSPSEKEDEDHTSVE 

NEECDGLMATTASGDITNQNSLAGGKNQGK 

VLIISTSTTNDYTPQVSAITDVEGGLSDALRTE 

ENMEGTRVTTEEFEAPMPSAVSGDDSQLTAS 

RSEEKDECAMISTSIGEEFELPISSATTIKCAES 

LQPVAAAVEERATGPVL1STADFEGPMPSAPP 

EAESPLASTSKEEKDECALISTSIAEECEASVS 

GVVVESENERAGTVMEEKDGSGIISTSSVEDC 

EGPVS S A VPQEEGDPS VTPAEEMGDTAM1STS 

TSEGCEAVMIGAVLQDEDRLTITRVEDLSDA 

A1ISTSTAECMPISASIDRHEENQLTADNPEGN 

GDLSATEVSKHKVPMPSLIAENNCRCPGPVR 

GGKEPGPVLAVSTEEGHNGPSVHKPSAGQGH 

PSAVCAEKEEKHGKECPEIGPFAGRGQKESTL 

HLINAEEKNVLLNSLQKEDKSPETGTAGGSST 

ASYSAGRGLEGNANSPAHLRGPEQTSGQTAK 

DSSVSSIRYLAAVNTGAIKADDMPPVQGTVA

EHSFLPAEQQGSEDNLKTSTTKCITGQESKIAP














SHTMIPPATYSVALLAPKCEQDLTIKNDYSGK 

WTDQASAEKTGDDNSTRKSFPEEGDIMVTVS 

SEENVCD1GNEESPLNVLGGLKLKANLKMEA 

YVPSEEEKNGEILAPPESLCGGKPSGIAELQRE 

PLLVNESLNVENSGFRTNEEIHSESYNKGEISS 

GRKDNAEAISGHSVEADPKEVEEEERHMPKR 

KRKQHYLSSEDEPDDNPDVLDSRIETAQRQC 

PETEPHATKEENSRDLEELPKTSSETNSTTSRV 

MEEKDEYSSSETTGEKPEQNDDDTIKSQE


1302 


2652 


A 


10167 


321 


842 


EPSLFPFLRPSPARPPPRPPAPFPSPELAGPEPH 

FVFYFFLSYVHPPKELAKYEYMEEQVILTEKG

NSTVAGRGTSVRCLSPSPRPLPPLLPLLADLLE 

DGFGEHPFYHCLVAEVPKJEHWTPEGNPSPFP 

EARETKCYVRSSVGCVEPLTTQAEVTENLDR 

KNSQQVFKLLKKK 


1303 


2653 


A 


10171 


206 


429 


NMILLKKRRLLINSLGEGTINGLLDELLETNV 
LSQEDTEIVKCENVTVIDKARDLLDSV1RKGA 
RACEICITYI 


1304 


2654 


A 


10184 


970 


1524 


LCTLSPGISGTAGSCLTTEPGTELGTSFAQNGF 

YHEAVVLFTQALKLNPQDHRLFGNRSFCHER 

LGQPAWALADAQVALTLRPGWPRGLFRLGK 

ALMGLQRFREAAAVFQETLRGGSQPDAAREL 

RSCLLHLTLQGQRGGICAPPLSPGALQPLPHA 

ELAPSGLPSLRCPRSTALRSPGLSPLLH 


1305 


2655 


A 


10194 


2 


394 


TDLLGRRFRVDGAAMAACEGRRSGALGSSQ 

SDFLTPPVGGAPWAVATTVVMYPPPPPPPHR 

DFISVTLSFGESYDNSKSWRRRSCWRKWKQL 

SRLQRNMILFLLAFLLFCGLLFYINLADHWKG 

IRNTCT 


1306 


2656 


A 


10195


1 


410 


IPGSTISLEGPLSKWTNVMKGWQYRWFVLDY 

NAGLLSYYTSKDKMMRGSRRGCVRLRGAVI 

GIDDEDDSTFTITVDQKTFHFQARBADEREK 

WIHALEETILRHTLQLQVRVFTWFPDSSLVGA 

FFFWLVSGFFFK 


1307 


2657 


A 


10205 


85 


308 


QGLPSTMVKLGCSFSGKPGKDPGDQDGAAM 
DSVPLISPLDISQLQPPLPDQVVIKTQTEYQLS 
SPDQQNYTKSR 


1308 


2658 


A 


10214 


2 


453 


ECGGIRQPGPGPPPALASAPAATMNRVGGSPS 

AAANYLLCTNCRKVLRKDKRIRVSQPLTRGP 

SAFIPEKEWQANTVDERTNFLVEEYSTSGRL 

DNITQVMSLHTQYLESFLRSQFYMLRMDGPL 

PLPYRHYIAIMAAARHQCSYLINM 


1309 


2659 


A 


10233 


45 


421 


RGWPEQQSTGRPRDVARQPRCQKEEGRRLRP 

RALESRTFQGSERSRWGPPLESTKENVQCGH 

RPAFPNSSWLPFHERLQVQNGECPWQVSIQM 

SRKHLCGGSILHWWWVLTAAHCFRRTLLDM 

AV 


1310 


2660 


A 


10241 


243 


442 


AFQLFNAKCESAFLSKRNPLQRNWTVLYRRK 
HKKGQSAEIQKKRTRRAFKFQRAITGASLADI 
MAK 


1311 


2661 


A 


10261 


751 


176 


LPGADYGGGH1 -SI Jtf .FH1XLTSAAWVPDESQ 

VTLNSAICVLSTVLIMEFPDLGKHCSEKTCKQ 

LDFLPVKCDACKQDFCKDHFPYAAHKCPFAF 

QKDVHVPVCPLCNTP1PVKKGQIPDVWGDHI 

DRDCDSHPGKKKEK1FTYRCSKEGCKKKEML 

QMVCAQCHGNFCIQHRHPLDHSCRHGSRPTI 

KAG 


1312 


2662 


A 


10270 


3 


669 


STSSDEGSPSASTPMINKTGFKFSAEKPVIEVP 
SMTILDKKDGEQAKALFEKVRKFRAHVEDSD














LI YKL Y V VQTV 1KTAKF1FILC YT ANF VN Al SF 

EHVCKPKVEHLIGYEVFECTHNMAYMLKKL 

LISYISIICVYGFICLYTLFWLFRIPLKEYSFEKV 

REESSFSDIPDVKNDFAFLLHMVDQYDQLYS 

KRFGVFLSEVSENKLREISLNHEWTFEKL 


1313 


2663 


A 


10287 


1221 


266 


GAHRVLSPAQGAQPRLRSAASVEVSMVGQR 

VLLLVAFLLSGVLLSEAAKILTISTLGGSHYLL 

LDRVSQILQEHGHNVTMLHQSGKFLIPDIKEE 

EKSYQV1RWFSPEDHQKRIKKHFDSYIETALD 

GRKESEAEVKLME1FGTQCSYLLSRKDIMDSL 

KNENYDLVFVEAFDFCSFLIAEKLVKPFVAIL 

PTTFGSLDFGLPSPLSYVPVFPSLLTDHMDFW 

GRVKNFLMFFSFSRSQWDMQSTFDNTIKEHF 

PEGSRPVLSHLLLKAELWFVNSDCAFDFARPL 

LPNTVYIGGLMEKPIKPVPQVSEPSAFSLGFT 


1314 


2664 


A 


10288 



536 


1890 


NVQLAKFSSTLVFFFSCDADPSALAKYVLAL 

VKKDKSEKELKALCIDQLDVFLQKETQIFVEK 

LFDAVNTKSYLPPPEQPSSGSLKVEFFPPQEK 

DIKKEEITKEEEREKKFSRRLNHSPPQSSSRYR

ENRSRDERKKDDRSRKRDYDRNPPRRDSYRD 

RYNRRRGRSRSYSRSRSRSWSKERLRERDRD 

RSRTRSRSRTRSRERDLVKPKYDLDRTDPLEN 

NYTPVSSVPSISSGHYPVPTLSSTITVIAPTHHG 

NNTTESWSEFHEDQVDHNSYVRPPMPKKRC 

RDYDEKGFCMRGDMCPFDHGSDPWVEDVN 

LPGMQPFPAQPPVVEGPPPPGLPPPPPILTPPPV 

NLRPPVPPPGPLPPSLPPVTGPPPPLPPLQPSG 

MDAPPNSATSSVPTVVTTGIHHQPPPAPPSLFT 

ADTYDTDGYNPEAPSITNTSRPMYRHRVHPR 

AKLG 


1315 


2665 


A 


10293 


447 


1331 


SHPLLSCPEKVSAKXRAAAEAAAEERRTRGA 

GSRGICAGLRSVAPGPEPLKQEEGRREWGSSI 

GTPSPCGSAQAAAAAAAEEATEKIPALRPALL 

WALLALWLCCATPAHALQCRDGYEPCVNEG 

MCVTYHNGTGYCKCPEGFLGEYCQHRDPCE 

KNRCQNGGTCVAQAMLGKATCRCASGFTGE 

DCQYSTSHPCFVSRPCLNGGTCHMLSRDTYE 

CTCQVGFTGRNPKCPGGNLNYQFNGIIVVYS 

GGSVPPSGTKTSKPAEHNAMGTGSKNFASGT 

LWVMVSGATSTSTSTL 


1316 


2666 


A 


10294 


118 


572 


SLSMESNHKSGDGLSGTQKEAALRALVQRTG 

YSLVQENGQRKYGGPPPGWDAAPPERGCEIFI 

GKLPRDLFEDELIPLCEKJGKIYEMRMMMDF 

NGNNRGYAFVTFSNKVEAKNAIKQLNNYEIR 

NGRLLGVCASVDNCRLFVGGIPKTKK 


1317 


2667 


A 


10301 


158 


1956 


LLKSCGVLLSGVCIPCEGKGPTVLVIQTAVPQ 

DRPTKSSMRSAAKPWNPATRAGGHGPDRVRP 

LPAASSGMKSSKSSTSLAFESRLSRLKRASSE 

DTLNKPGSTAASGVVRLKKTATAGAISELTES 

RLRSGTGAFTTTKRTGIPAPREFSVTVSRERSV 

PD r^DCXrDD V C\/CCDT'OCXI'TT>TTCyTX r UT DTDCTvn 

r KUr oINx'Klvo V obr I ooN i rl r 1 KrlLK 1 rb 1 KP 

KQENEGGEKAALESQVRELLAEAKAKDSEIN 

RLRSELKKYKEKRTLNAEGTDALGPNVDGTS 

VSPGDTEPMIRALEEKNKNFQKELSDLEEENR 

VLKEKLIYLEHSPNSEGAASHTGDSSCPTSITQ 

ESSFGSPTGNQLSSDIDEYKKNIHGNALRTSG 

SSSSDVTKASLSPDASDFEHITAETPSRPLSSTS 

NPFKSSKCSTAGSSPNSVSELSLASLTEKIQKM 

EENHHSTAEELQATLQELSDQQQMVQELTAE














NEKLVDEKTILETSFHQHRERAEQLSQENEKL 
MNLLQERVKNEEPTTQEGKIIELEQKCTGILE 
QGRPEREKLLNIQQQLTCSLRKVEEENQGAL
EMIKRLKEENEKLNEFLELERHNNNMMAKTL 
EECRVTLEGLKMENGSLKSHLQG 


1318 


2668 


A 


10303 


333 


879 


GECFIMAAVVQQNDLVFEFASNVMEDERQL 

GDPAIFPAVIVEHVPGADILNSYAGLACVEEP 

NDMITESSLDVAEEEIIDDDDDDITLTVEASCH 

DGDETIETIEAAEALLNMDSPGPMLDE1CRINN 

NIFSSPEDDMVVAPVTHVSVTLDGIPEVMETQ 

QVQEKYADSPGASSPEQPKRKKK 


1319 


2669 


A 


10322 


169 


654 


MEVRMSGSVAVTRAIAVPGLLLLLIIATALSL 

LIGAKSLPASVVLEAFSGTCQSADCTIVLDAR 

LPRTLAGLLAGGALGLAGALMQTLTRNPLAD 

PGLLGVNAGASFAIVLGAALFGYSSAQEQLA 

MAFAGALVASLIVAFTGSQGGGQLSPVRLTL 

AGVXL 


1320 


2670 


A 


10323 


441 


2 


KMNQVAW1GGGQTLGAFLCHGLAAEGYRV 
AWDIQSDKAANVAQEINAEYGESMAYGFG 
ADATSEQSVLALSRGVDEIFGRVDLLVYSAGI
AKAAFISDFQLGDFDRSLQVNLVGYFLCARE 
FSRLMIRDGIQGRIIQINSKSDE 


1321 


2671 


A 


10332 


1 


453 


RHRTAGPGSTISSRTDSASAPAARAMPCEYTY 

AKLTSDCSRPSLQWYTRAQSKMRRPRLLLKD 

ILKCTLLVFGVRILYILKLNYTTEECDMKNMH 

YVDPDHVKRAQKYAQQVLQKESPPKFAKTS 

MALLFEHRYSVDLLPFVQKAPTDSEA 


1322 


2672 


A 


10333 


25 


423 


EPSNGPWYSALGNEDDEILLLGKDIIGTFAAS 
ERKMRAHQVLTFLLLFVTTSGASENASTSRGC
GLDLLPQNVYLCDLDAIWGIVVEAVAGAGA 
L1TLLLMLILLGRLPFIKEKEKKSPAVLHFLFL 
LGTLG 


1323 


2673 


A 


10334 


52 


426 


SSLGNEDDEILSLAKDITGMFVASHRKMRAH
QVLTFLLLFVITSVASENASTSRGCGLDLLPQ 
YVSLCDLDAIWGIWEAAAGAGALITLLLMLI 
LLVRLPFFKEKEKKSPVGLHFLFLLGTLGP 


1324 


2674 


A 


10336 


1 


932 


ERLCFPCMQSKIYSYMSFNKCSGMRFPLQEE 

NSVTHHEVKCQGKPLAGIYRKREEKRNAGN 

AVRSAMKSEEQKIKDARKGPLVPFPNQKSEA 

AEPPKTPPSSCDSTNAAIAKQALKKPIKGKQA 

PRKKAQGKTQQNRKLTDFYPVRRSSRKSKAE 

LQSEERKRiDELIESGKEEGMKIDLlDGKGRG 

VIATKQFSRGDFWEYHGDLJE1TDAKKREAL 

YAQDPSTGCYMYYFQYLSKTYCVDATRETN 

RLGRLINHSKCGNCQTKLHDIDGVPHLILIAS 

RDIAAGEELLYDYGDRSKASIEAHPWLKH 


1325 


2675 


A 


10338 


3 


870 


PGSTISCSELKGTQCRATAGSRGRRPPMTCWL 

RGVTATFGRPAEWPGYLSHLCGRSAAMDLG 

PMRKSYRGDREAFEETHLTSLDPVKQFAAWF 

EEAVQCPDIGEANAMCLATCTRDGKPSARML 

LLKGFGKDGFRFFTNFESRKGKELDSNPFASL 

VFYWEPLNRQVRVEGPVKKLPEEEAECYFHS 

RPKSSQIGAVVSHQSSVIPDREYLRKKNEELE 

QLYQDQEVPKPKSWGGYVLYPQVMEFWQG 

QTNRLHDRIVFRRGLPTGDSPLGPMTHRGEE 

DWLYERLAP 


1326 


2676 


A 


10344 


2 


984 


ARAAAHCGICRLVRWWRKRRSVMGIQTSPV 
LLASLGVGLVTLLGLAVGSYLVRRSRRPQVT 
LLDPNEKYLLRLLDKTTVSHNTKRFRFALPTA

HHTLGLPVGKHIYLSTRIDGSLVIRPYTPVTSD 

EDQGYVDLVIKVYLKGVHPKFPEGGKMSQY 

LDSLKVGDVVEFRGPSGLLTYTGKGHFNIQP 

NKKSPPEPRVAKKLGMIAGGTGITPMLQLIRA 

1LKVPEDPTQCFLLFANQTEKDIILREDLEELQ 

ARYPNRFKLWFTLDHPPKDWAYSKGFVTAD

MIREHLPAPGDDVLVLLCGPPPMVQLACHPN 

LDKLGYSQKMRFTY 


1327 


2677 


A 


10345 


1 


968 


LQSAGEGVTHVLILLESPARPVAAVTQVQRR 
RYHRLSDMSMLAERRRKQKWAVDPQNTAW 
oNUlJbKxUt^KMLblvMu W aKOKuLOAQfcQG 
ATDHIKVQVKNNHLGLGATINNEDNWIAHQ 
DDFNQLLAELNTCHGQEri'DSSDKKEKKSFS 

1 *3VXIP VUVKjfl'tm^/^frM CCO O VTTM 
LiiEJVolVi oJVINK VH YiVLNJr J JxOiSJJLooKoiV i UL 

DCIFGKRQSKKTPEGDASPSTPEENETTTTSAF 

TIQEYFAKRMAALKNKPQVPVPGSDISETQVE 

RKRGiCKRNKEATGKDVESYLQPlCAKRHTEG 

KPERAEAQERVAKKKSAPAEEQLRGPCWDQ 

SSKASAQDAGDHVQPA 


1328 


2678 


A 


10346 


173 


439 


GSAAMKVKIKCWNGVATWLWVANDENCGI 
CRMAFNGCCPDCKVPGDDCPLVWGQCSHCF 
HMHCILKWLHAQQVQQHCPMCRQEWKFKE 


1329 


2679 


A 


10351 


3 


964 


QMEPGNDTQISEFLLLGFSQEPGLQPFLFGLFL 

SMYLVTVLGNLLIILATISDSHLHTPMYFFLSN 

LSFADICVTSTTlPKMLiMNIQTQNKViTYIACL 

MQMYFFILFAGFENFLLSVMAYDRFVAICHP 

LHYMVIMNPHLCGLLVLASWTMSALYSLLQI 

LMVVRLSFCTALEIPHFFCELNQVIQLACSDSF 

LNHMVIYFTVALLGGGPLTGILYSYSKIISS1H 

AISSAQGKYKAFSTCASHLSVVSLFYGAILGV 

YLSSAATRNSHSSATASVMYTVVTPMLNPFI 

YSLRNKDIKRALGIHLLWGTMKGQFFKKCP 


1330


2680 


A 


10352 


34 


2573 


1PFLKSCCCCCLFDFPPPPLDQVQEEECEVERV 

TEHGTPKPFRKEDSVAFGESQSEDEQFENDLE 

TDPPNWQQLVSREVLLGLKPCEIKRQEVINEL 

FYTERAHVRTLKVLDQVFYQRVSREGILSPSE 

LRKIFSNLEDILQLHIGLNEQMKAVRKRKETS 

VIDQIGEDLLTWFSGPGEEKLKHAAATFCSNQ 

PFALEMIKSRQfCKDSRFQTFVQDAESNPLCRR 

LQLKDIIPTQMQRLTKYPLLLDNIATYTEWPT 

EREKVKKAADHCRQILNYVNQAVKEAENKQ 

RLEDYQRRLDTSSLKLSEYPNVEELRNLDLTK 

RKMIHEGPLVWKVNRDKTIDLYTLLLEDILV 

LLQKQDDRLVLRCHSK1LASTADSKHTFSPV1 

KLSTVLVRQVATDNKALFVISMSDNGAQIYE 

LVAQTVSEKTVWQDLICRMAASVKEQSTKPI 

PLPQSTPGEGDNDEEDPSKLKEEQHGISVTGL 

QSPDRDLGLESTLISSKPQSHSLSTSGKSEVRD 

LFVAERQFAKEQHTDGTLKEVGEDYQIAIPDS 

HIPVSEERWALDALRNLGLLKQLLVQQLGLT 

t-ivo V y^ttu WV^llr rK IK1 Aov^txr %* 1 Do VH^)JNo.b 

NIKAYHSGEGHMPFRTGTGDIATCYSPRTSTE 

SFAPRDSVGLAPQDSQASNILVMDHMIMTPE 

MPTMEPEGGLDDSGEHFFDAREAHSDENPSE

GDGAVNKEEKDVNLRISGNYLILDGYDPVQE 

SSTDEEVASSLTLQPMTGIPAVESTHQQQHSP 

QNTHSDGAISPFTPEFLVQQRWGAMEYSCFEI 

QSPSSCADSQSQIMEYIHK1EADLEHLKKVEE 

SYTILCQRLAGSALTDKHSDKS


1331 


2681 


A 


10353 


1 


2100 


AVEFAEGALTMAPWPELGDAQPNPDKYLEG 

AAGQQPTAPDKSKETNKTDNTEAPVTKIELLP 

SYSTATLIDEPTEVDDPWNLPTLQDSGIKWSE 

RDTKGKILCFFQGIGRLILLLGFLYFFVCSLDIL 

SSAFQLVGGKMAGQFFSNSSIMSNPLLGLVIG 

VLVTVLVQSSSTSTSIVVSMVSSSLLTVRAA1P 

IIMGANIGTSITNT1VALMQVGDRSEFRRAFA 

GATVHDFFNWLSVLVLLPVEVATHYLEIITQL 

I VESFHFKNGED APDLL K VITKPFTKL1 VQLDK 

KVISQIAMNDEKAKNKSLVK1WCKTFINKTQ 

INVTWSTANCTSPSLC\\nT)GIQNWTMKNVT 

YKEN1AKCQHIFVNFHLPDLAVGT1LLILSLLV 

LCGCLIMIVKILGSVLICGQVATVIKKTINTDFP 

FPFAWLTGYLA1LVGAGMTFIVQSSSVFTSAL 

TPLIGIGVITIERAYPLTLGSNlGrrriAILAAL 

ASPGNALRSSLQ1ALCHFFFNISGILLWYPIPFT 

RLPIRMAKGLGNISAKYRWFAVFYLIIFFFLIP 

LTVFGLSLAGWRVLVGVGVPWFniLVLCLR 

LLQSRCPRVLPKKLQNWNFLPLWMRSLKPW 

DAWSKFTGCFQMRCCCCCRVCCRACCLLC 

GCPKCCRCSKCCEDLEEAQEGQDVPVKAPET 

FDNITISREAQGEVPASDSKTECTAL 


1332 


2682 


A 


10354 


30 


1377 


SQQGSQPHRQGPPSLLTAPHSLDLPALPPGPR 

GSQGKLRRVLVPMSVKPSWGPGPSEGVTAVP 

TSDLGEIHNWTELLDLFNHTLSECHVELSQST 

fCRVVLFALYLAMFVVGLVENLLVICVNWRG 

SGRAGLMNLYILNMAIADLGIVLSLPVWMLE 

VTLDYTWLWGSFSCRFTHYFYFVNMYSSIFF 

LVCLSVDRYVTLTSASPSWQRYQHRVRRAM 

CAG1WVLSAIIPLPEWHIQLVEGPEPMCLFM 

APFETYSTWALAVALSTTILGFLLPFPLLTVFN 

VLTACRLRQPGQPKSRRHCLLLCAYVAVFV 

MCWLPYHVTLLLLTLHGTHISLHCHLVHLLY 

FFYDVIDCFSMLHCVINPILYNFLSPHFRGRLL 

NAVVHYLPKDQTKAGTCASSSSCSTQHSIIIT 

KGDSQPAAAAPHPEPSLSFQAHHLLPNTSPISP 

TQPLTPS 


1333 


2683 


A 


10358 


2 


884 


AAGAGADGREPASERASRAEPPAVAMGQND 

LMGTAEDFADQFLRVTKQYLPHVARLCLIST 

FLEDGIRMWFQWSEQRDYIDTTWNCGYLLA

SSFVFLNLLGQLTGCVLVLSRNFVQYACFGLF 

GIIALQTIAYSILWDLKFLMFNLALGGGLLLL 

LAESRSEGKSMFAGVPTMRESSPKQYMQLGG 

RVLLVLMFMTLLHFDASFFSIVQNiVGTALMI 

LVAIGFKTKLAALTLWWLFAINVYFNAFWT 

IPVYKPMHDFLKYDFFQTMSVIGGLLLWAL 

GPGGVSMDEKKKEW 


1334 


2684 


A 


10367 


59 


1562 


QAWSLQVALSPFFFPASPSNSFAAAVPQLLFP 

ELPLPHVPGQESAKRRSARRFLIMSELTKELM 

ELVWGTKSSPGLSDTEFCRWTQGFVFSESEGS 

ALEQFEGGPCAVIAPVQAFLLKKLLFSSEKSS 

WRDCSQEEQKELLCHTLCDILESACCDHSGS 

YCLVSWLRGKTTEETASISGSPAESSCQVEHS 

SALAVEELGFERFHALIQKRSFRSLPELKDAV 

LDQYSMWGNKFGVLLFLYSVLLTKGIENIKN

EIEDASEPLIDPVYGHGSQSLINLLLTGHAVSN 

VWDGDRECSGMKLLGIHEQAAVGFLTLMEA 

LRYCKVGSYLKISKIPYLDCLASETHLTVFFA 

KDMALVAPEAPSEQARRVFQTYDPEDNGFIP 

DSLLEDVMKALDLVSDPEYINLMKNKLDPEG

LGIILLGPFLQEFFPDQGSSGPESFTVYHYNGL 
KQSNYNEKVMYVEGTAVVMGFEDPMLQTD 
DTPIKRCLQTKWPYIELLWJTDRSPSLN 


1335 


2685 


A 


10375 


82 


2929 


TRTKRRLGREKAMASPPRGWGCGELLLPFML 

LGTLCEPGSGQIRYSMPEELDKGSFVGNIAKD 

LGLEPQELAERGVRIVSRGRTQLFALNPRSGS 

LVTAGRIDREELCAQSPLCVVNFNILVENKM 

KIYGVEVEIIDINDNFPRFRDEELKVKVNENA 

AAGTRLVLPFARDADVGVNSLRSYQLSSNLH

FSLDVVSGTDGQKYPELVLEQPLDREKETVH 

DLLLTALDGGDPVLSGTTHIRVTVLDANDNA 

PLFTPSEYSVSVPENIPVGTRLLMLTATDPDE 

GINGKLTYSFRNEEEKISETFQLDSNLGEISTL 

QSLDYEESRFYLMEVVAQDGGALVASAKVV 

VTVQDVNDNAPEVILTSLTSSISEDCLPGTVIA 

LFSVHDGDSGENGEIACSIPRNLPFKLEKSVD 

NYYHLLTTRDLDREETSDYNITLTVMDHGTP 

PLSTESHIPLICVADVNDNPPNFPQASYSTSVT 

ENNPRGVSIFSVTAHDPDSGDNARVTYSLAE 

DTFQGAPLSSYVSINSDTGVLYALRSFDYEQL 

RDLQLWVTASDSGNPPLSSNVSLSLFVJ..DQN 

DNTPEILYPALPTDGSTGVELAPRSAEPGYLV 

TKWAVDKDSGQNAWLSYRLLKASEPGLFA 

VGLHTGEVRTARALLDRDALKQSLWAVED

HGQPPLSATFTVTVAVADRIPDILADLGSIKTP 

IDPEDLDLTLYLVVAVAAVSCVFLAFVIVLLV 

LRLRRWHKSRLLQAEGSRLAGVPASHFVGV 

DGVRAFLQTYSHEVSLTADSRKSHLIFPQPNY 

ADTLLSEESCEKSEPLLMSDKVDANKEERRV 

QQAPPNTDWRFSQAQRPGTSGSQNGDDTGT 

WPNNQFDTEMLQAMILASASEAADGSSTLGG 

GAGTMGLSARYGPQFTLQHVLQGELGSDYR 

QNVYIPGSNATLTNAAGKRDGKAPAGGNGN 

KKKSGKKEKK 


1336 


2686 


A 


10379 


1 


557 


RPRRRQPSFSCRVLVLEDPPCFRFTNSMNQEK 
LAKLQAQVRIGGKGTARRKKKVVHRTATAD
DKKXQSSLKJCLAVNNIAGIEEVNMIKDDGTVI 
HFNNPKVQASLSANTFAITGHAEAKPITEMLP 
GILSQLGADSLTSLRKLAEQFPRQVLDSKAPK 
PEDIDEEDDDVPDLVENFDEASKNEAN 


1337 


2687 


A 


10380 


1 


1263 


IPGSTISWSPAAARGLSVCRCCRLHPASAMDL 

FGDLPEPERSPRPAAGKEAQKGPLLFDDLPPA 

SSTDSGSGGPLLFDDLPPASSGDSGSLATSISQ 

MVKTEGKGAKRKTSEEEKNGSEELVEKKVC 

KASSV1FGLKGYVAERKGEREEMQDAHVILN 

DITEECRPPSSLITRVSYFAVFDGHGGIRASKF 

AAQNLHQNLIRKFPKGDVISVEKTVKRCLLD 

TTKHTDEEFLKQASSQKPAWKDGSTATCVLA

VDNILYIANLGDSRAILCRYNEESQKHAALSL 

SKEHNPTQYEERMRIQKAGGNVRDGRVLGV 

LEVSRSIGDGQYKRCGVTSVPDIRRCQLTPND 

RFILLACDGLFKVFTPEEAVNFILSCLEDEKIQ 

TREGKSAADARYEAACNRLANKAVQRGSAD

NVTVMVVRIGH 


1338 


2688 


A 


10385 


3 


589 


GPSQSMAAGELEGGKPLSGLLNALAQDTFHG

YPGITEELLRSQLYPEVPPEEFRPFLAKMRGIL 

KSIASADMDFNQLEAFLTAQTKKQGGITSDQ 

AAVISKFWKSHKTKIRESLMNQSRWNSGLRG 

LSWRVDGKSQSRHSAQIHTPVAIIELELGKYG

QESEFLCLEFDEVKVNQILKTLSEVEESISTLIS

QPN 


1339 


2689 


A 


10386 


50 


390 


LGAMAKHHPDLIFCRKQAGVAIGRLCEKCDG 
KCVICDSYVRPCTLVRICDECNYGSYQGRCVI 
CGGPGVSDAYYCKECTIQEKDRBGCPKJVNL 
GSSKTDLFYERKKYGFKKR 


1340 


2690 


A 


10388 


113 


3472 


SQLRKGASATHSSPSRTDCIAQMMDIYVCLK 

RPSWMVDNKRMRTASNFQWLLSTFILLYLM 

NQVNSQKKGAPHDLKCVTNNLQVWNCSWK 

APSGTGRGTDYEVCIENRSRSCYQLEKTSIKIP 

ALSHGDYEITINSLHDFGSSTSKFTLNEQNVSL 

IPDTPEILNLSADFSTSTLYLKWNDRGSVFPHR 

SNV1WE1KVLRKESMELVKLVTHNTTLNGKD 

TLHHWSWASDMPLECAIHFVEIRCYIDNLHFS

GLEEWSDWSPVKNISWIPDSQTKVFPQDKVIL 

VGSDITFCCVSQEKVLSALIGHTNCPLIHLDGE 

NVAIKIRNISVSASSGTNWFriEDNIFGTVIF 

AGYPPDTPQQLNCETHDLKEIICSWNPGRVTA 

LVGPRATSYTLVESFSGKYVRLKRAEAPTNES 

YQLLFQMLPNQEIYNFTLNAHNPLGRSQSTIL 

VNITEKVYPHTPTSFKVKDINSTAVKLSWHLP 

GNFAKINFLCEIEIKKSNSVQEQRNVTIKGVE 

NSSYLVALDKLNPYTLYTFRIRCSTETFWKW 

SKWSNKKQHLTTEASPSKGPDTWREWSSDG 

KNLIIYWKPLPINEANGKILSYNVSCSSDEETQ 

SLSEIPDPQHKAEIRLDKNDY1ISWAKNSVGS 

SPPSKIASMEIPNDDLKIEQWGMGKGILLTW 

HYDPNMTCDYVIKWCNSSRSEPCLMDWRKV 

PSNSTETVIESDEFRPGIRYNFFLYGCRNQGY 

QLLRSMIGYEEELAPIVAPNFTVEDTSADSILV 

KWEDIPVEELRGFLRGYLFYFGKGERDTSKM

RVLESGRSDDCVKNITDISQKTLRIADLQGKTS 

YHLVLRAYTDGGVGPEKSMYWTKENSVGL 

IIAlLIPVAVAVIVGVVTSILCYRKJUiW 

PDIPNPENCKALQFQKSVCEGSSALKTLEMNP 

CTPNNVEVLETRSAFPKIEDTEIVSPVAERPEN 

RSDAKPENHWESYCPPHEEEIPNPAADETGG 

TAQVIYIDVQSMYQPQAKPEEEQENDPVGGA 

GYKPQMHLPINSTVEDIAAEEDLDKTAGYRP 

QANVNTWNLVSPDSPRSIDSNSEIVSFGSPCSI 

NSRQFLIPPKDEDSPKSNGGGWSFTNFFQNKP 

ND 


1341 


2691 


A 


10392 


1 


5057 


MLPPKHLSATKPKKSWAPNLYELDSDLTKEP 

DVIIGEGPTDSEFFHQRFRNLIYVEFVGPRKTL 

IKLRNLCLDWLQPETRTKEEHELLVLEQYLTn 

PEKLKPWVRAKKPENCEKLVTLLENYKEMY 

QPEGESLHGVLWSAGLRCPLGLSASTLLTW 

SGLDNSLSWAAVGMSCVLWDIELHHDFLGV 

ATKSVSTHAQGDAAQGLGGTIVRMWARDSN 

LATGVLLDDNNSDVTSDDDMTRNRRESSPPH 

SVHSFSGDRDWDRRGRSRDTEPRDRWSHTR 

NPRSRMPPRDLSLPVVAKTSFEMDREDDRDS 

T> A VP RPQfin APSYONVVm A POT? VPT-TNTTOn 
tV/V I JIoIVjjI^L'/VDO I V^IN V V UL, Anj_JKJVr X11N 1 1\£LJ 

NMENYRKLLSLGVQLAEDDGHSHMTQGHSS

RSKRSAYPSTSRGLKTMPEAKKSTHRRGICED 

ESSHGVIMEKFIKDVSRSSKSGRARESSDRSQ 

RFPRMSDDNWKDISLNKRESVIQQRVYEGNA 

FRGGFRFNSTLVSRKRVLERKRRYHFDTDGK 

GSIHDQKGCPRKKPFECGSEMRKAMSVSSLS 

SLSSPSFTESQPIDFGAMPYVCDECGRSFSV1S 

EFVEHQIMHTRENLYEYGESFIHSVAVSEVQK 




WO 01/57188 



PCT/US01/03800 



SEQ ID NO: of nucleotide sequence
SEQ ID NO: of peptide sequence
Method
SEQ ID NO: in USSN 09/496,914
Predicted beginning

SQVGGKRFECKDCGETFNKSAALAEHRK1HA 

RGYLVECKNQECEEAFMPSPTFSELQKIYGK 

DKFYECRVCKETFLHSSALIEHQKIHFGDDKD 

NEREHERERERERGETFRPSPALNEFQKMYG 

KEKMYECKVCGETFLHSSSLKEHQKIHTRGN 

PFENKGKVCEETF1PGQSLKRRQKTYNKEKLC 

DFTDGRDAFMQSSELSEHQKIHSRXNLFEGR 

GYEKSVIHSGPFTESQKSHTITRPLESDEDEKA 

FTISSNPYENQKIPTKENVYEAKSYERSVIHSL 

ASVEAQKSHSVAGPSKPKVMAESTIQSFDAIN 

HQRVRAGGNTSEGREYSRSVIHSLVASKPPRS 

HNGNELVESNEKGESSIYISDLNDKRQKIPAR 

ENPCEGGSKNRNYEDSVIQSVFRAKPQKSVP 

GEGSGEFKKX^GEFSVPSSWREYQKARAKKK 

YIEHRSNETSVIHSLPFGEQTFRPRGMLYECQ 

ECGECFAHSSDLTEHQKIHDREKPSGSRNYE 

WSVIRSLAPTDPQTSYAQEQYAKEQARNKCK 

DFRQFFATSEDLNTNQKIYDQEKSHGEESQGE 

NTDGEETHSEETHGQETIEDPVIQGSDMEDPQ 

KDDPDDKIYECEDCGLGFVDLTDLTDHQKVH 

SRKCLVDSREYTHSVIHTHSISEYQRDYTGEQ 

LYECPKCGESFIHSSFLFEHQRJHEQDQLYSM

KGCDDGFIALLPMKPRRNRAAERNPALAGSA

IRCLLCGQGFIHSSALNEHMRLHREDDLLEQS 

QMAEEAIIPGLALTEFQRSQTEERLFECAVCG 

ESFVNPAELADHVTVHKNEPYEYGSSYTHTS 

FI TFPT K'f^ArPFVFfTfnr'fTK^FTWQT'VT TKWYT? 
i 1j i i-si iji\Unirr I H/V^rvL'v_ VJrvor lilo 1 VLJ Mi\r, 

LHLEEEEEDEAAAAAAAAAQEVEANVHVPQ 

WLRIQGLNVEAAEPEVEAAEPEVEAAEPEV 

EAAEPNGEAEGPDGEAAEPIGEAGQPNGEAE

QFNGDADEPDGAGIEDPEERAEEPEGKAEEPE 

GDADEPDGVGIEDPEEGEDQEIQVEEPYYDC 

HECTETFTSSTAFSEHLKTHASMIIFEPANAFG 

ECSGYIERASTSTGGANQADEKYFKCDVCGQ 

LFNDHLSLARHQNTHTG 


1342 


2692 


A 


10393 


2 


U50 


GRPRSSSDNRNFLRERAGLSSAAVQTOIGNSA 

ASRRSPAARPPVPAPPALPRGRPGTEGSTSLS 

APAVLWAVAWVVWSAVAWAMANYIHV 

PPGSPEVPKLNVTVQDQEEHRCREGALSLLQ 

HLRPHWDPQEVTLQLFTDGITNKLIGCYVGN 

TMEDWLVRIYGNKTELLVDRDEEVKSFRVL 
OAJJGCAPOLYCTFNNGI CYFFIOOFAT DPT<fW 

VCNPAIFRLIARQLAKIHAIHAHNGWIPKSNL 

WLKMGKYFSUPTGFADEDINKRFLSD1PSSQI 

LQEEMTWMKEILSNLGSPWLCHNDLLCKNH 

YNEKQGDVQFIDYEYSGYNYLAYDIGNHFNE 

FAGVSDVDYSLYPDRELQSQWLRAYLEAYK 

EFKGFGTEVTEKEVEILFIQVNQFALASHFFW 

GLWAXIQAKYSTIEFDFLGYAIVRFNQYFKM 

KPEVTALKVPE 


1343 


2693 


A 


10394 


102 


839 


PEAQTSAVLAREKGHLPTMRHEAPMQMASA 
QDARYGQKDSSDQNFDYMFKXLUGNSSVGK 
TSFLFRYADDSFTSAFVSTVGIDFKVKTVFICN 
EKRIKLQIWDTAGQERYR1 1 II AYYRGAMGFI 
LMYDITNEESFNAVQDWSTQIKTYSWDNAQ 
VILVGNKCDMEDERVISTERGQHLGEQLGFE 
FFETSAKDNINVKQTFERLVDIICDKMSESLET
DPAITAAKQNTRLKETPPPPQPNCAC 


1344 


2694 


A 


10395 


2 


4136 


DRPPWNSRVDDFVTNLIHLSSKGHISPAKDTS
LQQRTPAEMSPVLHFYVRPSGHEGAASGHTR

RKLQGKLPELQGVETELCYNVNWTAEALPSA 

EETKKLMWLFGCPLLLDDVARESWLLPGSN

DLLLEVGPRLNFSTPTSTNIVSVCRATGLGPV

DRVETTRRYRLSFAHPPSAEVEAIALATLHDR 

MTEQHFPHPIQSFSPESMPEPLNGPINILGEGR 

LALEKANQELGLALDSWDLDFYTKRFQELQR 

NPSTVEAFDLAQSNSEHSRHWFFKGQLHVDG 

QKLVHSLFESIMSTQESSNPNNVLKFCDNSSA 

IQGKEVRFLRPEDPTRPSRFQQQQGLRHVVFT 

AETHNFPTGVCPFSGATTGTGGRIRDVQCTG 

RGAHWAGTAGYCFGNLHIPGYNLPWEDLSF 

QYPGNFARPLEVAIEASNGASDYGNKFGEPV 

LAGFARSLGLQLPDGQRREW1KPIMFSGGIGS 

MEADHISKEAPEPGMEWKVGGPVYRIGVGG 

GAASSVQVQGDNTSDLDFGAVQRGDPEMEQ 

KMNRVIRACVEAPKGNPICSLHDQGAGGNG

NVLKELSDPAGAIIYTSRFQLGDPTLNALEIW 

GAEYQESNALLLRSPNRDFLTHVSARERCPA 

CFVGTITGDRRIVLVDDRECPVRRNGQGDAP 

PTPPPTPVDLELEWVLGKMPRKEFFLQRKPP 

MLQPLALPPGLSVHQALERVLRLPAVASKRY 

LTNKVDRSVGGLVAQQQCVGPLQTPLADVA 

WALSHEELIGAATALGEQPVKSLLDPKVAA 

RLAVAEALTNLVFALVTDLRDVKCSGNWM 

WAAKLPGEGAALADACEAMVAVMAALGVA 

VDGGKDSLSMAARVGTETVRAPGSLVISAYA 

VCPDITATVTPDLKHPEGRGHLLYVALSPGQ 

HRJLGGTALAQCFSQLGEHPPDLDLPENLVRA 

FSITQGLLKDRIXCSGHDVSDGGLVTCLLEM 

AFAGNCGLQVDVPVPRVDVLSVLFAEEPGLV 

LEVQEPDLAQVLKRYRDAGLHCLELGHTGE 

AGPHAMVRVSVNGAVVLEEPVGELRALWEE 

TSFQLDRLQAEPRCVAEEERGLRERMGPSYC 

LPPTFPKASVPREPGGPSPRVAILREEGSNGDR 

EMADAFHLAGFEVWDVTMQDLCSGAIGLDT

FRGVAFVGGFSYADVLGSAKGWAAAVTFHP

RAGAELRRFRKRPDTFSLGVCNGCQLLALLG 

WVGGDPNEDAAEMGPDSQPARPGLLLRHNL 

SGRYESRWASVRVGPGPALMLRGMEGAVLP 

VWSAHGEGYVAFSSPELQAQIEARGLAPLHW 

ADDDGNPTEQYPLNPNGSPGGVAGICSCDGR 

HLAVMPHPERAVIU* WQWAWKrrrrDl H 1 1> 

PWLQLFINARNWTLEGSC 


1345 


2695 


A 


10396 


65 


642 


GVRGFWAGTMASRAGPRAAGTDGSDFQHRE 

RVAMHYQMSVTLKYEIKKLIYVHLVIWLLLV 

AKMSVGHLRLLSHDQVAMPYQWEYPYLLSI 

LPSLLGLLSFPRNNISYLVLSMISMGLFSIAPLI 

YGSMEMFPAAQQLYRHGKAYRFLFGFSAVSI 

MYLVLVLAVQVHAWQLYYSKKLLDSWFTST 

QEKKHK 


1346 


2696 


A 


10398 


1 


718 


DDFVRCGPQSAAMGASARLLRAVIMGAPGS 

EIGVLAKAFIDQGKLIPDDVMTRLALHELKNL 

TQYSWLLDGFPRTLPQAEALDRAYQIDTVINL 

NVPFEVIKQRLTARWIHPASGRVYNIEFNPPK 

TVGIDDLTGEPLIQREDDKPETVIKRLKAYED 

QTKPVLEYYQKKGVLETFSGTETNKIWPYVY 

AFLQTKVPQRSQKASVTP 


1347 


2697 


A 


10402 


153 


1969 


KHRQENNALDMAPEIHMTGPMCLIENTNGEL 
VANPEALKJLSAITQPWVVAIVGLYRTGKSY

D=Aspartic Acid, E=Glutamic Acid,
F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine,
M=Methionine, N=Asparagine, P=Proline,
Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan,
Y=Tyrosine, X=Unknown, *=Stop codon,
/=possible nucleotide deletion, \=possible
nucleotide insertion














LMNKLAGKNKGFSLGSTVKSHTKGIWMWCV 

PHPKKPEHTLVLLDTEGLGDVKKGDNQNDS 

WIFTLAVLLSSTLVYNSMGTINQQAMDQLYY 

VTELTHR1RSKSSPDENENEDSADFVSFFPDFV 

WTLRDFSLDLEADGQPLTPDEYLEYSLKLTQ 

GTSQKDKNFNLPRLCIRKFKPKKKCFVFDLPI 

HRRKLAQLEKLQDEELDPEFVQQVADFCSYI 

FSNSKTKTLSGGIKVNGPRLESLVLTYINAISR 

GDLPCMENAVLALAQIENSAAVQKAIAHYD 
OOMGOKVOI PAFTT OFT T DT HRV<;FT?F atfv 

YMKNSFKDVDHLFQKXLAAQLDKKRDDFCK 

QNQEASSDRCSALLQVIFSPLEEEVKAGIYSK 

PGGYCLFIQKLQDLEKKYYEEPRKGIQAEEIL 

QTYLKSKESVTDAILQTDQILTEKEKEIEVEC 

VKAESAQASAKMVEEMQIICYQQMMEEKEKS 

YQEHVKQLTEKMERERAQLLEEQEKTLTSKL 

QEQARVLKERCQGESTQLQNEIQKLQKTLKK 

KTKRYMSHKLKJ 


1348 


2698 


A 


10404 


5 


892 


TQLPAPLSGVLSRLQLGSGAPLLTWVQETAG 

VAGTrAPRTJRTPVTMWRl I AT? A Q APT t r>\/T>T c 

DSWALLPASAGVKTLLPVPSFEDVSIPEKPKL 

RFIERAPLVPKVRREPKNLSDIRGPSTEATEFT 

EGNFAILALGGGYLHWGHFEMMRLTINRSM 

DPKNMFAIWRVPAPFKPITRKSVGHRMGGGK 

GAIDHYVTPVKAGRLVVEMGGRCEFEEVQG 

FLDQVAHKLPFAAKAVSRGTLEKMRKDQEE 

RERNNQNPWTFER1ATANMLGIRKVLSPYDL 

THKGKYWGKFYMPKRV 


1349 


2699 


A 


10409 


59 


1184 


LRRNCSALGGLFQTIISDMKGSYPVWEDFINK 
AGKXQSQLRTTWAAAAFLDAFQKVADMAT 
NTRGGTREIGSALTRMCMRHRSIEAKLRQFSS 

EYKKARQEIKJCKSSDTLKLQKJCAXKGRGDIQ 

PQLDSALQDVNDKYLLLEETEKQAVRKALIE 

ERGRFCTFISMLRPVIEEEISMLGEITHLQTISE 

DLKSLTMDPHKLPSSSEQVILDLKGSDYSWS 

YQTPPSSPSTTMSRKSSVCSSLNSVNSSDSRSS 

GSHSHSPSSHYRYRSSNLAQQAPVRLSSVSSH 

DSGFISQDAFQSKSPSPMPPEAPNQRRKEKRE 

PDPNGGGPT1 ASGPPAAAEEAQRPRSM 


1350 


2700 


A 


10410 


511 


958 


AGRGGPGKPVSWSSGPGSPGQTQRRSWVKST 

RGHSSLLPPSQDFVAGLSVILRGTVDDRLNW 

AFNLYDLNKDGCITKEEMLDIMKSIYDMMG 

KYTYPALREEAPREHVESFFQKMDRNKDGV 

VTIEEFIESCQKDENIMRSMQLFDNVI

WHAT IS CLAIMED IS:

1. An isolated polynucleotide comprising a nucleotide sequence selected from the group
consisting of SEQ ID NO: 1-1350, a mature protein coding portion of SEQ ID NO: 1-1350, an
active domain of SEQ ID NO: 1-1350, and complementary sequences thereof.

2. An isolated polynucleotide encoding a polypeptide with biological activity, wherein said 
polynucleotide hybridizes to the polynucleotide of claim 1 under stringent hybridization 
conditions. 

3. An isolated polynucleotide encoding a polypeptide with biological activity, wherein said 
polynucleotide has greater than about 90% sequence identity with the polynucleotide of claim 1. 
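
By way of illustration only, and not as part of the claimed subject matter, a percent sequence identity of the kind recited in claim 3 could be computed along the lines of the following minimal Python sketch. The claim does not specify an alignment algorithm or scoring scheme; the sketch assumes the two sequences have already been aligned to equal length, with `-` marking gaps. The function names `percent_identity` and `exceeds_identity_threshold` are hypothetical, and the 90% default merely mirrors the figure in claim 3.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over the columns of a pre-computed pairwise alignment.

    Assumption: both inputs are already aligned, have equal length,
    and use '-' for gap positions.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be pre-aligned to the same length")
    # Drop columns in which both sequences carry a gap.
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if not (a == "-" and b == "-")
    ]
    if not columns:
        raise ValueError("no aligned columns to compare")
    # A column counts as a match only when both residues are identical.
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


def exceeds_identity_threshold(aligned_a: str, aligned_b: str,
                               threshold: float = 90.0) -> bool:
    """True when identity is strictly greater than the threshold."""
    return percent_identity(aligned_a, aligned_b) > threshold
```

Other conventions for the denominator (for example, the length of the shorter sequence) are also in common use and would give different values for gapped alignments.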

4. The polynucleotide of claim 1 wherein said polynucleotide is DNA. 

5. An isolated polynucleotide of claim 1 wherein said polynucleotide comprises the 
complementary sequences. 
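
As a non-limiting illustration of the complementary sequences referred to in claims 1 and 5, the following Python sketch derives the reverse complement of a DNA strand using standard Watson-Crick pairing (A with T, C with G). The function name `reverse_complement` is hypothetical, and the restriction to the letters A, C, G, T and N is an assumption made for the sketch.

```python
# Watson-Crick pairing for DNA; N (any base) is left unchanged.
_DNA_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA sequence, read 5' to 3'."""
    seq = dna.upper()
    unexpected = set(seq) - set("ACGTN")
    if unexpected:
        raise ValueError(f"unexpected symbols: {sorted(unexpected)}")
    # Complement each base, then reverse to keep the 5'-to-3' orientation.
    return seq.translate(_DNA_COMPLEMENT)[::-1]
```

For example, `reverse_complement("ATGC")` returns `"GCAT"`.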

6. A vector comprising the polynucleotide of claim 1.

7. An expression vector comprising the polynucleotide of claim 1.

8. A host cell genetically engineered to comprise the polynucleotide of claim 1.

9. A host cell genetically engineered to comprise the polynucleotide of claim 1 operatively 
associated with a regulatory sequence that modulates expression of the polynucleotide in the host 
cell. 

10. An isolated polypeptide, wherein the polypeptide is selected from the group consisting of:

(a) a polypeptide encoded by any one of the polynucleotides of claim 1; and

(b) a polypeptide encoded by a polynucleotide hybridizing under stringent conditions
with any one of SEQ ID NO: 1-1350.

11. A composition comprising the polypeptide of claim 10 and a carrier.

12. An antibody directed against the polypeptide of claim 10.

13. A method for detecting the polynucleotide of claim 1 in a sample, comprising:

a) contacting the sample with a compound that binds to and forms a complex 
with the polynucleotide of claim 1 for a period sufficient to form the complex; and 

b) detecting the complex, so that if a complex is detected, the polynucleotide 
of claim 1 is detected. 

14. A method for detecting the polynucleotide of claim 1 in a sample, comprising:

a) contacting the sample under stringent hybridization conditions with
nucleic acid primers that anneal to the polynucleotide of claim 1 under such conditions;

b) amplifying a product comprising at least a portion of the polynucleotide of
claim 1; and

c) detecting said product and thereby the polynucleotide of claim 1 in the
sample.

15. The method of claim 14, wherein the polynucleotide is an RNA molecule and the method 
further comprises reverse transcribing an annealed RNA molecule into a cDNA polynucleotide. 

16. A method for detecting the polypeptide of claim 10 in a sample, comprising:

a) contacting the sample with a compound that binds to and forms a complex 
with the polypeptide under conditions and for a period sufficient to form the complex; and 

b) detecting formation of the complex, so that if a complex formation is 
detected, the polypeptide of claim 10 is detected. 

17. A method for identifying a compound that binds to the polypeptide of claim 10,
comprising:

a) contacting the compound with the polypeptide of claim 10 under
conditions sufficient to form a polypeptide/compound complex; and

b) detecting the complex, so that if the polypeptide/compound complex is 
detected, a compound that binds to the polypeptide of claim 10 is identified. 

18. A method for identifying a compound that binds to the polypeptide of claim 10,
comprising:

a) contacting the compound with the polypeptide of claim 10, in a cell, under
conditions sufficient to form a polypeptide/compound complex, wherein the complex drives 
expression of a reporter gene sequence in the cell; and 

b) detecting the complex by detecting reporter gene sequence expression, so 
that if the polypeptide/compound complex is detected, a compound that binds to the polypeptide 
of claim 10 is identified.

19. A method of producing the polypeptide of claim 10, comprising:

a) culturing a host cell comprising a polynucleotide sequence selected from 
the group consisting of a polynucleotide sequence of SEQ ID NO: 1-1350, a mature protein 
coding portion of SEQ ID NO: 1-1350, an active domain of SEQ ID NO: 1-1350, 
complementary sequences thereof and a polynucleotide sequence hybridizing under stringent 
conditions to SEQ ID NO: 1-1350, under conditions sufficient to express the polypeptide in said 
cell; and 

b) isolating the polypeptide from the cell culture or cells of step (a). 

20. An isolated polypeptide comprising an amino acid sequence selected from the group
consisting of SEQ ID NO: 1351-2700, the mature protein portion thereof, or the active domain
thereof.

21. The polypeptide of claim 20 wherein the polypeptide is provided on a polypeptide array.

22. A collection of polynucleotides, wherein the collection comprises the sequence 
information of at least one of SEQ ID NO: 1-1350. 

23. The collection of claim 22, wherein the collection is provided on a nucleic acid array. 

24. The collection of claim 23, wherein the array detects full-matches to any one of the 
polynucleotides in the collection. 

25. The collection of claim 23, wherein the array detects mismatches to any one of the 
polynucleotides in the collection. 

26. The collection of claim 22, wherein the collection is provided in a computer-readable 
format. 
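
Claim 26 does not name any particular computer-readable format. Purely as an illustrative assumption, the sketch below writes sequence information as a FASTA record, a widely used plain-text format consisting of a `>` header line followed by the sequence wrapped at a fixed width. The function name `to_fasta`, the identifier style `SEQ_ID_NO_1` and the 60-character line width are hypothetical choices for the example.

```python
def to_fasta(seq_id: str, sequence: str, width: int = 60) -> str:
    """Format one sequence as a FASTA record.

    Whitespace inside the sequence (for example, from line wrapping in a
    printed listing) is removed before the sequence is re-wrapped.
    """
    if width <= 0:
        raise ValueError("width must be positive")
    cleaned = "".join(sequence.split()).upper()
    # Break the cleaned sequence into fixed-width lines.
    lines = [cleaned[i:i + width] for i in range(0, len(cleaned), width)]
    return f">{seq_id}\n" + "\n".join(lines) + "\n"
```

For example, `to_fasta("SEQ_ID_NO_1", "ATGC ATGC", width=4)` yields a header line followed by two four-base lines.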
27. A method of treatment comprising administering to a mammalian subject in need thereof
a therapeutic amount of a composition comprising a polypeptide of claim 10 or 20 and a
pharmaceutically acceptable carrier.

28. A method of treatment comprising administering to a mammalian subject in need thereof 
a therapeutic amount of a composition comprising an antibody that specifically binds to a 
polypeptide of claim 10 or 20 and a pharmaceutically acceptable carrier.

Pages 340 to 1963 of this application contain amino acid sequence listings.
They can be obtained at the address given below. 



World Intellectual Property Organization 
34, chemin des Colombettes 
CH-1211 Genève 20



